Lentivirus from the group of immunodeficiency viruses of drill monkeys (Mandrillus leucophaeus) and their use

ABSTRACT

The present invention relates to an immunodeficiency virus of drill monkeys, its RNA, the corresponding cDNA, proteins derived therefrom and fragments of the nucleic acids or proteins. The invention likewise relates to the diagnostic use of the nucleic acids and proteins mentioned and their fragments and to a diagnostic.

[0001] The present invention relates to the immunodeficiency virus SIM27 of drill monkeys, whose RNA or a part thereof is complementary to the sequence shown below, and variants of this virus. Moreover, the viral RNA, the corresponding cDNA, proteins derived therefrom and fragments of the nucleic acids or proteins are a subject of the present invention. The invention likewise relates to the diagnostic use of the mentioned nucleic acids and proteins and their fragments, and a diagnostic comprising these nucleic acids and/or proteins and/or fragments thereof.

[0002] Primates have been developing for approximately 30 million years, which has led to a high degree of variability among the individual primate species. The New World monkeys (Platyrrhini) are differentiated from the Old World monkeys (Catarrhini), which for their part are divided into the hominoids (Hominoidea) and the cercopithecoids (Cercopithecoidea). Together with the primates, various infective agents have also developed, which have adapted to the individual primate species or, for example, to a whole family. Examples of such viruses are the simian pathogenic and the human pathogenic herpesviruses, which, although they can still infect individuals of another primate species, are not naturally transmitted from one primate species to another. Other viruses still infect all primates, such as the rabies virus, the yellow fever virus and the filoviruses.

[0003] Retroviruses are subdivided into the genera of the spume viruses, the T-leukemia/lymphoma viruses and the immunodeficiency viruses. A general survey of the leukemia and immunodeficiency viruses of the monkeys and their pathogenicity is found in the article of Hayami (Hayami M et al., Curr. Top. Microbiol. Immunol. 1994; 188: 1-20). Spume viruses appear to occur only in monkeys. Since until now a pathogenicity of the spume viruses has not been detected, these viruses are being investigated less intensively than HIV/SIV and HTLV/STLV.

[0004] HTLVs, the human T-leukemia viruses type I and type II, are structurally very similar to STLVs, the simian (monkey) T-lymphoma viruses (Franchini et al., AIDS Res Human Retrovirus 1994; 10: 1047-1060). Thus the difference between the virus species STLV I and II, and between the viruses of man (HTLV) and of monkeys (STLV), is a sign of a long separate evolution in the individual primates, provided that cross-transmission between the various primate species can be excluded (Franchini et al., AIDS Res Human Retrovirus 1994; 10: 1047-1060). STLV-infected monkeys occur over the entire world (Hayami M et al., Curr. Top. Microbiol. Immunol. 1994; 188: 1-20), whereas SIV-infected monkeys are only to be found naturally in Africa, which is an indication that SIV very probably developed later than STLV.

[0005] Molecular biology results show clearly that HIV-1 is very closely related to the immunodeficiency viruses of the chimpanzee. The latter viruses are subsequently designated as SIV-1, whereas the virus of the mangabeys, SIVsm, is designated as SIV-2. SIV-1 and HIV-1 derive with high probability from a precursor virus, just as SIV-2 and HIV-2 probably have a common precursor. Up to 25% of the monkeys in a troop can be naturally infected with SIV-2 without signs of the virus pathogenesis being detectable in the infected animals (Chen Z et al., J Virol. 1996; 70: 3617-3627). In the case of SIV-2, infections in man were detected which do not differ in their pathogenesis from an HIV-2 infection. SIV-2 is closely related to HIV-2 and particularly epidemically widespread in West Africa south of the Sahara, in the same region in which the mangabeys live (Gao F L et al., Nature 1992; 358: 495-499). The results of the investigations on SIV show that, in addition to the SIV-2 (SIVsm) of mangabeys, the immunodeficiency viruses of the African green monkey represent a further type, perhaps SIV-3. In addition, some further SIVs have meanwhile been isolated which cannot be assigned to the groups of viruses mentioned and which probably represent the SIV type 4. This SIV-4 type is formed by the viruses of the Sykes monkeys (Cercopithecus mitis), the L'Hoest monkeys (Cercopithecus l'hoesti) (Hirsch V M et al., J. Virol. 1999; 73: 1036-1045), the red-capped mangabeys (Cercocebus torquatus torquatus) (Georges-Courbot M C et al., J. Virol. 1998; 72: 600-608), the mandrill monkeys SIVmnd (Mandrillus sphinx) (Tsujimoto H et al., Nature 1989; 341: 539-541), and the drill monkeys (Mandrillus leucophaeus) (Clewley J P et al., J. Virol. 1998; 72: 10305-10309). All previously isolated SIV-4s can be cultured in human peripheral blood lymphocytes and some in the human permanent cell line Molt4 clone 8 (Hirsch V M et al., J. Virol. 1999; 73: 1036-1045), which indicates that the infection of man with these viruses should also be possible. The SIV-4 type is so different from the SIV-2 type that an SIVmac(SIV2)-specific p25 antigen test cannot detect SIVhoest(SIV4) produced in the supernatant of infected cells (Hirsch V M et al., J. Virol. 1999; 73: 1036-1045), as the Gag region is too divergent for recognition by monoclonal antibodies. The phylogenetic comparison of the nucleic acid sequences of the simian viruses also shows that the SIV-4 described here differs from SIV-2 and SIV-3 (Korber et al. Human Retroviruses and AIDS 1997. A compilation and analysis of nucleic acid and amino acid sequences. Los Alamos National Laboratory, New Mexico, 1998).

[0006] As described above, a virus similar to SIVcpz is possibly the precursor virus of the viruses causing human HIV-1 infections, as indicated by the high similarity of the viruses of the groups HIV1-M, -N and -O to SIV-1.

[0007] To date, there are no reports that humans have been infected with SIV-4. A nosocomial infection with SIV-3 or SIV-2 occurred due to contamination of the eczematous skin of a laboratory assistant (Khabbaz R F et al., N. Engl. J. Med. 1994; 330: 172-177). The SIV replicated for a certain time which was sufficient for the induction of a strong antibody response, but was not sufficient to establish a permanent infection (Khabbaz R F et al., N. Engl. J. Med. 1994; 330: 172-177). About 3.5 years after seroconversion, the laboratory assistant appeared to be free of the infection (Khabbaz R F et al., N. Engl. J. Med. 1994; 330: 172-177). Whether this path of virus elimination is the rule or whether persistent infections with corresponding pathogenesis can also result from the infective event is unknown.

[0008] Since until now no epidemiological studies on target groups in central Africa have been carried out which can show whether variant viruses such as SIV-4 also circulate in the human population, infection of man cannot be confirmed, but can also not be excluded.

[0009] As was seen in the example of the HIV-1 subtype O, antibody detection tests on the basis of HIV-1 subtype M were not sufficiently reactive to detect all subtype O-infected patients (Simon F et al. AIDS 1994; 8: 1628-1629). The diagnosis of an infection with an aberrant human pathogenic SIV subtype could probably also not be made, as it must be assumed that the ELISA screening tests based on HIV-1 and/or HIV-2 antigens would be negative or only slightly reactive, and that the attempt at confirmation by means of the immunoblot would produce a negative or probably questionable result. The diagnosis could probably also not be made by means of the nucleic acid tests, since with the presently available tests, for example, neither the nucleic acid of the viruses of group O nor that of HIV-2 can be reliably amplified (Gürtler L et al., 12th World AIDS Conference Geneva Basic Science 1: 121-124).

[0010] The drill monkeys described here (Mandrillus leucophaeus) are animals which originate from the western region of Cameroon bordering Nigeria and live wild there in the bushland. Drill monkeys have become widespread in the central West-African region. The animals are hunted and eaten, which is why the stock in recent years has continuously decreased. Young animals are in some cases picked up and kept in the vicinity of the houses as pets. The monkey 27 described here (3 years old) was captured from a free hunting reserve and then domesticated over the course of a year and has had no contact with other monkeys of the same or of a similar species.

[0011] As described in Example 2, the virus originating from monkey 27 was replicated in human PBLs. Genomic DNA and thus also integrated proviral DNA of the SIV was isolated from the infected cells. The deciphering of the sequence of the total genome of the SIV is described in Example 3. The PCR (polymerase chain reaction) method was employed for the multiplication of the viral DNA. The components needed for carrying out the process can be acquired commercially.

[0012] Using this process, it is possible to amplify DNA sequences if DNA regions of the sequence to be amplified are known, or if known sections are sufficiently similar. Short complementary DNA fragments (oligonucleotides = primers) which anneal to a short region of the nucleic acid sequence to be amplified must then be synthesized. For carrying out the test, nucleic acids are combined with the primers in a reaction mixture which additionally contains a polymerase and nucleotide triphosphates. The polymerization (DNA synthesis) is carried out for a specific time, then the nucleic acid strands are separated by heating. After cooling, the polymerization starts again.
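The principle described above can be illustrated with a minimal in-silico sketch. It is not part of the disclosed method: the template and both primers are made-up toy sequences, and the function names are hypothetical. The sketch only shows how a forward primer and the reverse complement of a reverse primer delimit the region that would be amplified, assuming perfect primer matches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the region bounded by both primers, or None if one does not bind.

    Assumption: primers must match exactly; real annealing tolerates mismatches.
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so its reverse
    # complement is what appears on the strand given as `template`.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Toy template (hypothetical): flank + forward site + insert + reverse site + flank
template = "GGTTAACC" "ATGCCGTAAG" "CTTGGATCCAGTCAGG" "CAGTTGACCT" "CCATGG"
forward_primer = "ATGCCGTAAG"
reverse_primer = "AGGTCAACTG"  # reverse complement of CAGTTGACCT

amplicon = in_silico_pcr(template, forward_primer, reverse_primer)
print(amplicon, len(amplicon))
```

The returned fragment starts with the forward primer and ends with the reverse complement of the reverse primer, which is the product that accumulates over the repeated cycles of denaturation, annealing and polymerization.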

[0013] The amplified genome sections were sequenced by the Sanger method. As described in Example 4, the genome of SIM27 was subjected to phylogenetic comparisons which showed that it is a strongly divergent novel simian immunodeficiency virus.

[0014] The present invention therefore relates to:

[0015] 1.) Immunodeficiency viruses which branch off as a side branch from the SIM27 side branch after the branching of SIM27 in a phylogenetic investigation of their total genome on the nucleic acid level, as is described in Example 4 (see FIG. 1).

[0016] 2.) GAG proteins and fragments thereof which branch off as a side branch from the SIM27 side branch after the branching of SIM27 in a phylogenetic investigation of their total sequence on the amino acid level, as is described in Example 4 (see FIG. 2).

[0017] 3.) POL proteins and fragments thereof which branch off as a side branch from the SIM27 side branch after the branching of SIM27 in a phylogenetic investigation of their total sequence on the amino acid level, as is described in Example 4 (see FIG. 4), or a POL protein fragment or subfragments thereof which branch off as a side branch from the SIM27 side branch after the branching of SIM27 in the region of the sequence including this amino acid sequence, published by Clewley (Clewley J P et al., J. Virol. 1998; 72: 10305-10309), as has been investigated as described in Example 4 (see FIG. 6).

[0019] 4.) ENV proteins and fragments thereof which branch off as a side branch from the SIM27 side branch after the branching of SIM27 in a phylogenetic investigation of their total sequence on the amino acid level, as is described in Example 4 (see FIG. 7).

[0020] Of particular interest is furthermore the consideration of the strongly immunogenic cysteine loop region in the Env gene, which is therefore of particular diagnostic importance. The cysteine loop regions of various immunodeficiency viruses are shown in Table 1.

TABLE 1
Virus, isolate        Env sequence (cysteine loop set off by spaces)
SIM27.ENV             RLTALEEYVADQSRLAVWG CSFSQVC HTNVKW
SIV-Mandrill, MNDGB1  RLTSLENYIKDQALLSQWG CSWAQVC HTSVEW
HIV1-N, YBF30         KVLAIERYLRDQQILSLWG CSGKTIC YTTVPW
HIV1-C, 96bw05.02     RTLAVERYLKDQQLLGIWG CSGKLIC TTAVPW
HIV1-O, ANT70C        RLLALETLLQNQQLLSLWG CKGKLVC YTSVKW
SIV-CPZ, CPZGAB       RLLAVERYLQDQQILGLWG CSGKAVC YTTVPW
HIV1-O, MVP5180       RLQALETLIQNQQRLNLWG CKGKLIC YTSVKW
SIV-lhoesti           RLTALEEYVKHQALLASWG CQWKQVC HTNVEW
SIV-SYKES             RLTALETYLRDQATLSNWG CAFKQIC HTAVTW
SIV-CPZ, CPZANT       RMLAVEKYLRDQQLLSLWG GADKVTC KTTVPW
SIV-CPZ-US            RVLAVERYLKDQQILGLWG CSGKTTC YTTVPW
HIV1-F, 93br020.1     RVLAVERYLKDQQLLGLWG CSGKLIC TTNVPW
HIV1-A, 92ug037       RVLAVERYLRDQQLLGIWG CSGKLIC PTNVPW
HIV1-H, 90cr056       RVLAVERYLRDQQLLGTWG CSGKLIC TTNVPW
HIV1-D, NDK           RVLAVERYLRDQQLLGIWG CSGRHIC TTNVPW
HIV2-B, UC1           RVTAIEKYLKDQALLNSWG CAERQVC HTTVPW
SIV-D, MNE            RVTAIEKYLXDQAQLNAWG CAERQVC HTTVPW
SIV-D, MM239          RVTAIEKYLKDOAQLNAWG CAFRQVC HTTVPW
SIV, SME543           RVTAIEKYLKDQAQLMSWG CAFRQVC HTTVPW
SIV-D, SMM-PBJ-6P9    RVTAIEKYLKDQAQLNSWG CAERQVC HTTVPW
SIV-D, STM            RVTAIEKYLKDQAQLNSWG CAERQVC HTTVPW
HIV2-A, CAM           RVTAIEKYLKDQAQLNSWG CAERQVC HTTVPW
HIV2-A, GH1           RVTAIEKYLKDQAQLNSWG CPFRQVC HTTVPW
HIV2-B, EHO           RVTAIEKYLKDQAQLNSWG CAFRQVC HTTVPW
SIV-SMM, PGM          RVTAIEKYRKDQAQLNSWG CAFRQVC HTTVPW
SIV-VERVET, AGM155    RVTALEKYLAZQARLNAWG CSWKQVC HTTVPW
SIV-VERVET, AGM3      RVTALEKYLEDQARLNAWG CAWKQVC HTTVPW
SIV-SABAEUS, AGMSAB1  RVTALEKYLEDQARLNIWG CAFRQVC HTTVLW
SIV-VERVET, AGMTY6    RVTALEKYLEDQARLNSWG CAWKQVO HTTVEW
SIV-GRIVET, AGM677A   RVTALEKYLEDQARLNSWG CAWKQVC HTTVPW
SIV-VERVET, REV       RVTALEKYLEDQARLNVWG CAWKQVC HTTVPW
SIV-TANTALUS, TAN1    RVTALEKYLEDQTRLNLWG CAFKQVC HTTVPW

[0021] As can be clearly seen, either lysine or arginine occurs in position 3 of the cysteine loop (C12345C) in nearly all representatives of the immunodeficiency viruses. The only exception up to now was found in the immunodeficiency virus MNDGB1, which was isolated from a mandrill (Mandrillus sphinx). It can therefore be assumed with great probability that antibodies formed against this modified epitope cannot be recognized, or can be recognized only with clearly decreased efficiency, by the diagnostic tests known up to now, which are based on the customary arginine- or lysine-containing antigens.
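The observation about position 3 can be checked mechanically. The following is a minimal sketch, not part of the disclosure: it takes a handful of the cysteine loops listed in Table 1, reads the residue at position 3 of the C12345C motif, and reports the isolates that carry neither lysine (K) nor arginine (R) there. The function names and the choice of five example isolates are illustrative assumptions.

```python
# Cysteine loops (C12345C) copied from a few rows of Table 1.
CYSTEINE_LOOPS = {
    "SIM27.ENV": "CSFSQVC",
    "SIV-Mandrill, MNDGB1": "CSWAQVC",
    "HIV1-N, YBF30": "CSGKTIC",
    "HIV2-B, UC1": "CAERQVC",
    "SIV-SYKES": "CAFKQIC",
}


def loop_residue(loop: str, position: int) -> str:
    """Return the residue at positions 1..5 of a C12345C loop."""
    if len(loop) != 7 or loop[0] != "C" or loop[-1] != "C":
        raise ValueError(f"not a C12345C loop: {loop!r}")
    if not 1 <= position <= 5:
        raise ValueError("position must be between 1 and 5")
    # Index 0 is the leading cysteine, so position n sits at index n.
    return loop[position]


def lacks_basic_residue_at_3(loop: str) -> bool:
    """True if position 3 is neither lysine (K) nor arginine (R)."""
    return loop_residue(loop, 3) not in ("K", "R")


exceptions = sorted(
    name for name, loop in CYSTEINE_LOOPS.items() if lacks_basic_residue_at_3(loop)
)
print(exceptions)
```

For these five rows, the sketch singles out SIM27 (serine at position 3) and MNDGB1 (alanine at position 3), in line with the text.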

[0022] This invention therefore likewise relates to antigens in which arginine and/or lysine within the cysteine loop region in position 3 has been replaced by any desired amino acid, particularly preferably a polar amino acid such as serine or an amino acid having an aliphatic side chain such as alanine.

[0023] The present invention is moreover described in the examples and in the patent claims, where the examples serve for illustration and no restriction of the present invention is to be derived therefrom.

EXAMPLE 1

[0024] Identification of the SIM27 infection in drill monkeys

[0025] In the course of a study, EDTA blood was taken from drill monkeys in the villages of rural Cameroon, in which they were kept, and this was analyzed in various HIV tests. On testing the serum of the monkey SIM27 for antibodies, a competitive ELISA for HIV-1 was negative and an ELISA from Dade Behring (Enzygnost HIV-1/2 plus) recognizing HIV-1, -2 and -O was likewise negative, the extinction lying near the threshold value. In the analysis of the HIV-1 Western blot (virus MVP899-87) which was carried out at the same time, no virus-specific bands were to be seen, in the HIV-2 blot (virus MVP11971-87), the band gp36 was to be seen strongly, and the bands p55 and p68 were to be seen, and in the HIV-1 group O blot (virus MVP5180-91), the bands p24 and p55 were to be seen. Gp36 is the transmembrane protein of HIV-2, the bands p55 and p68 correspond to the reverse transcriptase (p55) plus the RNaseH (p68) of HIV-2, and p24 is the inner core protein of HIV-1 group O viruses and p55 the precursor protein of gag and thus also p24. 20 ml of plasma from the animals were employed in order to develop the Western blot. According to the analysis of the nucleic acid sequence, the virus MVP11971-87 is a representative of the group HIV-2A, the virus MVP899 a representative of HIV-1B.

[0026] The SIV infection of the monkeys with the drill virus is thus distinguished:

[0027] by negativity in normal screening ELISAs for HIV antibodies,

[0028] by serological cross reaction in the env and pol region with the HIV-2 transmembrane glycoprotein and the reverse transcriptase in the Western blot,

[0029] by serological cross reaction in the gag region with the inner core protein of HIV-1 group O and absent cross reaction with the core proteins of group M (HIV-1B) in the Western blot.
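The three criteria listed above can be written down as a simple rule. The following is only an illustrative sketch under stated assumptions, not a validated diagnostic algorithm: the class and function names are hypothetical, band names follow Example 1, and the rule treats either reverse transcriptase band (p55 or p68) on the HIV-2 blot as sufficient evidence of pol cross reaction.

```python
from dataclasses import dataclass


@dataclass
class SerologyResult:
    """Hypothetical container for the test results described in Example 1."""
    screening_elisa_positive: bool
    hiv2_blot_bands: set      # bands seen on the HIV-2 Western blot
    hiv1_o_blot_bands: set    # bands seen on the HIV-1 group O Western blot
    hiv1_m_blot_bands: set    # bands seen on the HIV-1 group M Western blot


def matches_drill_virus_pattern(result: SerologyResult) -> bool:
    """Apply the three criteria of paragraphs [0027] to [0029]."""
    elisa_negative = not result.screening_elisa_positive
    env_cross_reaction = "gp36" in result.hiv2_blot_bands
    # Assumption: one of the two reverse transcriptase bands is enough.
    pol_cross_reaction = bool({"p55", "p68"} & result.hiv2_blot_bands)
    gag_cross_reaction_o = "p24" in result.hiv1_o_blot_bands
    no_core_reaction_m = "p24" not in result.hiv1_m_blot_bands
    return (elisa_negative and env_cross_reaction and pol_cross_reaction
            and gag_cross_reaction_o and no_core_reaction_m)


# Band pattern reported for the serum of monkey SIM27 in Example 1.
sim27 = SerologyResult(
    screening_elisa_positive=False,
    hiv2_blot_bands={"gp36", "p55", "p68"},
    hiv1_o_blot_bands={"p24", "p55"},
    hiv1_m_blot_bands=set(),
)
print(matches_drill_virus_pattern(sim27))
```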

EXAMPLE 2

[0030] Isolation of the SIM27 virus

[0031] The lymphocyte fraction was isolated by Ficoll gradient centrifugation from 5 ml each of EDTA blood of the monkeys. The lymphocytes were stimulated with PHA (phytohemagglutinin, 5 mg/ml) and PMA (myristylphorbol ester, 10 ng/ml). After 3 days both additives were washed out and the culture was continued as usual in RPMI-1640 with addition of interleukin-2. The PMA stimulation was described by Kubo et al. (Kubo M et al., J. Virol 1997; 71: 7560-7566).

[0032] The culture conditions were similar to those which have been described by Tamalet et al. (Tamalet, C. et al., AIDS 1994; 1083-1088). After one week in culture, human PHA-stimulated and nonstimulated blood lymphocytes (PBLs) were added to the monkey lymphocytes and the addition was repeated once weekly until it was possible after about 3 weeks to detect beginning SIV production by means of a commercially obtainable p24 antigen test (Abbott, Wiesbaden).

[0033] The virus was then subcultured on human lymphocytes from the supernatant of the cells. All attempts to transfer the SIM27 to permanent culture cells such as HUT-78 or Jurkat have failed up to now. By means of monthly subculturing, it was possible to keep SIM27 on PBL in culture for 9 months from then on.

EXAMPLE 3

[0034] DNA isolation, amplification and structural characterization of genome sections of the SIV isolate SIM27

[0035] Genomic DNA from SIM27-infected blood lymphocytes was isolated by standard methods (Current Protocols in Molecular Biology, Wiley Interscience, 1994).

[0036] The total genome was amplified exclusively by means of PCR (polymerase chain reaction). All PCRs were begun by means of “Hot Start”: all components of the PCR except the polymerase were combined, and the polymerase was added only after heating the sample to 94° C., which strongly reduces the extension of nonspecifically binding primers.

[0037] A general survey of the individual stages of the deciphering of the genome is shown in FIG. 8.

[0038] For the characterization of genome regions of the isolate SIM27, PCR experiments were carried out with primer pairs from the region of the integrase in the pol gene. The PCR (Saiki et al., Science 239: 487-491, 1988) was modified as follows:

[0039] For the first amplification of SIV-specific DNA regions, 5 μl (200 μg/ml) of genomic DNA from SIM27-infected blood lymphocytes were pipetted into a 50 μl reaction mixture (0.25 mM dNTP, 1 μM each primer, 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl₂, 0.001% gelatin, 2.5 units platinum-Taq DNA polymerase (Gibco)) and amplified according to the following temperature program:

[0040] 1) initial denaturation: 3 min. 95° C.,

[0041] 2) amplification: 30 sec. 94° C., 30 sec. 49° C., 30 sec. 68° C. (30 cycles).
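The quantities given above imply a template amount and a minimum run time that are easy to work out. The following sketch does this arithmetic; the function names are illustrative, and the run time counts only the programmed hold times, ignoring the ramping between temperatures (an assumption, since ramp rates are not stated).

```python
def template_mass_ug(volume_ul: float, stock_ug_per_ml: float) -> float:
    """Mass of DNA pipetted, in micrograms (1 ml = 1000 microliters)."""
    return volume_ul * stock_ug_per_ml / 1000.0


def final_concentration_ug_per_ml(mass_ug: float, reaction_volume_ul: float) -> float:
    """Template concentration in the finished reaction mixture."""
    return mass_ug * 1000.0 / reaction_volume_ul


def programmed_hold_time_s(initial_denaturation_s: int, step_times_s, cycles: int) -> int:
    """Sum of all programmed hold times; ramping between steps is ignored."""
    return initial_denaturation_s + cycles * sum(step_times_s)


# Values from paragraphs [0039] to [0041]
mass = template_mass_ug(volume_ul=5, stock_ug_per_ml=200)
concentration = final_concentration_ug_per_ml(mass, reaction_volume_ul=50)
hold_time = programmed_hold_time_s(3 * 60, (30, 30, 30), cycles=30)

print(f"template DNA: {mass} ug, i.e. {concentration} ug/ml in the reaction")
print(f"programmed hold time: {hold_time} s ({hold_time / 60} min)")
```

Under these assumptions the reaction contains 1 μg of genomic DNA (20 μg/ml), and the program runs for at least 48 minutes.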

[0042] The primers used for the PCR had the following sequence:

[0043] (Seq. ID No. 1 and 2)

[0044] 5′Spol2380agm GCC ATG TGT CCA AAA TGT CA
3pol2930agm CTT CTC TGT AGT AGA CTC TA

[0045] 5 μl of the amplificate were employed as a template for a second nested PCR with the following primers and the same temperature profile:

[0046] (Seq. ID No. 3 and 4)

[0047] Spol2460agm TAG TAG CAG TCC MYR KWG (M=A/C, Y=C/T, R=A/G, K=G/T, W=A/T)
3pol2760agm TCT CTA ATT TGT CCT ATG AT
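The degenerate positions in the first nested primer stand for a mixture of oligonucleotides. As a minimal sketch (the function name is illustrative), the ambiguity codes given above can be expanded into the individual sequences they represent; with five two-fold degenerate positions this yields 2^5 = 32 variants.

```python
from itertools import product

# Ambiguity codes exactly as defined in paragraph [0047]
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "Y": "CT", "R": "AG", "K": "GT", "W": "AT",
}


def expand_degenerate_primer(primer: str) -> list:
    """Return every unambiguous sequence represented by a degenerate primer."""
    bases = primer.replace(" ", "")
    choices = [AMBIGUITY[base] for base in bases]
    return ["".join(combination) for combination in product(*choices)]


variants = expand_degenerate_primer("TAG TAG CAG TCC MYR KWG")
print(len(variants))

# Start of the amplificate sequence reported in Table 2
table2_start = "AGTAGCAGTCCATGTAGCCA"
matching = [v for v in variants if table2_start.startswith(v[1:])]
print(matching)
```

One of the 32 variants, TAGTAGCAGTCCATGTAG, agrees from its second base onward with the first bases of the sequence in Table 2, which is what one expects for an amplificate that begins with the primer.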

[0048] The amplificate thus obtained was sequenced directly without cloning.

[0049] The sequence found is shown in Table 2.

TABLE 2
1 AGTAGCAGTC CATGTAGCCA GTGGATACCT AGAGGCAGAA GTAATACCAG
51 CAGAGACAGG AAAAGAGACA GCACATTTCC TGTTAAAGTT AGCAGGCAGG
101 TGGCCTGTAA AACATTTAGA CACTGACAAT GGCCCCAACT TTGTCAGTGA
151 AAAGGTAGCC ACAGTCTGTT GGTGGGCTCA AATAGAGCAC ACCACAGGTG
201 TACCCTATAA CCCCCAGAGT CAGGGAGTAG TGGAAGCAAA GAATCATCAT
251 CTTAAGACAA TCATAGGACA AATTAGAGA
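Sequence tables of this kind, with a position label in front of each row of five ten-base blocks, can be turned back into a contiguous sequence with a few lines of code. The sketch below is illustrative only (the function name is an assumption); it uses the first three rows of Table 2 and checks that each position label agrees with the number of bases read so far, so that a mislabeled or truncated row is flagged.

```python
def parse_numbered_rows(rows) -> str:
    """Join numbered sequence rows, checking each position label."""
    sequence = ""
    for row in rows:
        label, *blocks = row.split()
        expected = len(sequence) + 1
        if int(label) != expected:
            raise ValueError(f"row labeled {label}, expected position {expected}")
        sequence += "".join(blocks)
    return sequence


# First three rows of Table 2
rows = [
    "1 AGTAGCAGTC CATGTAGCCA GTGGATACCT AGAGGCAGAA GTAATACCAG",
    "51 CAGAGACAGG AAAAGAGACA GCACATTTCC TGTTAAAGTT AGCAGGCAGG",
    "101 TGGCCTGTAA AACATTTAGA CACTGACAAT GGCCCCAACT TTGTCAGTGA",
]

sequence = parse_numbered_rows(rows)
print(len(sequence), sequence[:20])
```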

[0050] Based on the publication of Clewley (Clewley J P et al., J. Virol 1998; 72: 10305-10309), a further amplificate was obtained in the 5′ region of the pol gene. The primers DR1 and DR2 and, for the nested PCR, DR4 and DR5 described by Clewley were used, as well as the temperature cycles described in this publication. Taq DNA polymerase (Perkin Elmer) was used together with the buffers described above.

[0051] The sequence according to Table 3 was obtained here:

TABLE 3
1 GGGATTCCGC ANCCGGCAGG TCTAAAACAA TGTGAACAGA TCAGAGTATT
51 GGATATAGGA GATGCCTATT TTTCATGCCC ATTGGATGAG GACTTTAGAA
101 AGTATACTGC ATTCACCATT CCATCGGTGA ATAATCAGGG GCCCAGGAAT
151 CAGATACCAG TATAATGTCC TCCCNCAGGG NTGGAAGGGG TCCCC

[0052] In a next amplification, the region of SIM27 lying between the amplificates already obtained was amplified. The primers mentioned below were used here.

[0053] For the first PCR:

[0054] (Seq. ID No. 5 and 6)

[0055] 1216 ATG CCC ATT GGA TGA GGA C
1197 GAC TGT GGC TAC CTT TTC ACT

[0056] For the nested PCR:

[0057] (Seq. ID No. 7 and 8)

[0058] 1218 CAT CGG TGA ATA ATC AGG
1226 GGT ATT ACT TCT GCC TCT A

[0059] The platinum-Taq DNA polymerase (Gibco) was used according to the following temperature program:

[0060] 1) initial denaturation: 2 min. 95° C.,

[0061] 2) amplification: 30 sec. 95° C., 30 sec. 55° C., 150 sec. 68° C. (30 cycles).

[0062] The sequence according to Table 4 was obtained here.

TABLE 4
1 CATCGGTGAA TAATCAGGGC CCAGGAATCA GATACCAGTA TAATGTCCTC
51 CCACAGGGAT GGAAAGGCTC TCCAGCAATT TTTCAGGCAA CAGCTGATAA
101 AATCTTGAAA ACATTCAAAG AAGAATACCA GAGGTATTAA TTTATCAGTA
151 TATGGATGAT CTGTTCGTGG GAAGTGACTT AAATGCCACT GAACATAACA
201 AAATGATAAA CAAGTTGAGA GAGCATCTGA GATTCTGGGG GCTCGAGACC
251 CCAGATAAGA AGTTTCAAAA GGAACCTCCT TTTGAATGGA TGGGATATGT
301 GCTACACCCA AAGAAATGGA CAGTGCAGAA AATACAACTA CCAGAAAAAG
351 AGCAATGGAC AGTGAATGAT ATTCAGAAAT TGGTAGGAAA ACTTAATTGG
401 GCAAGTCAGA TATATTCCGG AATTAAAACA AAAGAGCTCT GTAAATTGAT
451 CAGAGGAGGA AAACCTCTAG ATGAAATAGT AGAATGGACA AGAGAAGCAG
501 AATTAGAGTA TGAAGAGAAT AAGATAATAG TGCAGGAGGA GGTGCATGGA
551 GTGTACTATC AGCCAGAAAA ACCACTGATG GCAAAAGTAC AAAAGTTGAC
601 ACAAGGACAG TGGAGTTATC AAATAGAGCA AGAAGAAAAC AAACCTCTCA
651 AGGCAGGAAA ATATGCCAGG ACAAAGAATG CCCACACAAA TGAGTTAAGG
701 ACACTTGCAG GGTTAGTACA AAAAATAGCC AAGGAATGCA TAGTAATCTG
751 GGGAAGATTG CCAAAATTTT ACCTCCCCTT GGAGAGAGAA GTATGGGATC
801 AATGGTGGCA TGATTATTGG CAGGTAACAT GGATCCCAGA GTGGGAATTC
851 ATCTCAACAC CACCATTGAT AAGGCTATGG TACAACCTCC TGAAAGAACC
901 AATTCCAGGA GAAGATGTAT ACTATGTACA TGGGGCAGCT AACAGAAATT
951 CTAAAGAAGG CAAGGCAGGA TACTATACAG CAAGGGGCAA AAGTAAGGTA
1001 ATAGCTTTAG AAAATACAAC CAATCACAAG GCAGACCTGA AGGCAATAGA
1051 ATTAGCCCTA AAAGATTCAG GACCAAGAGT AAACATAGTA ACAGATTCAC
1101 AGTATGCATT AGGCATACTC ACAGCATCCC CAGATCAGTC AGATAACCCC
1151 ATAGTTAGGG AAATAATTAA CCTCATGATA GCCAAGGAAG CAGTCTACCT
1201 GTCATGGGTA CCAGCCCACA AGGGTATAGG AGGTAACGAA CAAATAGACA
1251 AATTAGTAAG CCAAGGAATT AGGCAAGTAC TATTCCTGGA AGGAATAGAC
1301 AGAGCTCAGG AAGAACACGA CAAATATCAT AACAACTGGA GAGCTTTAGC
1351 TCACGAATTC AGCATACCTC CTATAGTGGC AAAAGAGATA GTTGCACAAT
1401 GCCCAAAATG CCAGATAAAA GGGGAACCTA TTCATGGCCA GGTAGATGCA
1451 AGTCCTGGGA CATGGCAAAT GGATTGCACC CATCTAGAAG GAAAGGTCAT
1501 CATAGTGGCA CTCCATGTAG CCAGTGGATA CCTACAGGCA GAAGTAATAC
1551 C

[0063] The region of the total sequence of the 5′-LTR region of the genome up to the pol gene was amplified with the following primer pairs:

[0064] 1. PCR:

[0065] (Seq. ID No. 9 and 10)

[0066] 1248 CTC AAT AAA GCT TGC CTT GA
1217 GTC CTC ATC CAA TGG GCA T

[0067] 2. Nested PCR:

[0068] (Seq. ID No. 11 and 12)

[0069] 1249 TRD CTA GAG ATC CCT CAG A (R=A/G, D=G/A/T)
1219 CCA ATA CTG TGA TCT GTT CAC

[0070] The platinum-Taq DNA polymerase (Gibco) was in each case used according to the following temperature program:

[0071] 1) initial denaturation: 2 min. 95° C.,

[0072] 2) amplification: 30 sec. 95° C., 30 sec. 50° C., 180 sec. 68° C. (30 cycles). 1× enhancer (Gibco) was used in addition to the buffers indicated above.

[0073] The sequence according to Table 5 was obtained here:

TABLE 5
1 TRDCTAGAGA TCCCTCAGAT TTGTGCCAGA CTTCTGATAT CTAGTGAGAG
51 TAGAGAAAAA TCTCCAGGAG TGGCGCCCGA ACAGGGACTT GACGAAGAGC
101 CAAGTCATTC CCACCTGTGA GGGACAGCGG CGGCAGCGRG CCGGACCGAC
151 CCACCCGGTG AAGTGAGTTA ACCAAGGAGC CCCGACGCGC AGGACACAAG
201 GTAAGCGCTG CACCGTGCTG TAGTGAGTGT GTGTCCAGGA TCCGCTTGAG
251 CAGGCGAGAT CGCCGAGGCA ACCCCACTAG AAAAAGAAAA GAGGGGAAGT
301 AAGGCCGAGG CAAAGTGAAA GTAAAAGAGA TCCTCTGAGA AGAGGAACAG
351 GGGGCAATAA AATTGGCGCG AGCGCGTCAG GACTTAGGGG AAGAGAATTG
401 GATGAGCTGG AAAAGATTAC GTTACGGCCC TCCGGAAAGA AAAAATACCA
451 GCTAAAACAT GTGATATGGG TAAGCAAGGA ACTAGATAGA TTTGGCCTAC
501 ATGAAAAGTT GTTAGAAACC AAGGAAGGAT GCGAAAAAAT TCTTAGCGTA
551 CTCTTTCCTC TAGTTCCTAC AGGGTCAGAA AATTTAATTT CGCTGTACAA
601 CACCTGCTGT TGCATTTGGT GCCTACATGC GAAAGTGAAA GTAGCAGATA
651 CAGAAGAGGC AAAAGAGAAA GTAARACAAT GCTACCATCT AGTGGTTGAA
701 AAACAGAATG CAGCCTCAGA AAAAGAAAAA GGAGCAACAG TGACACCTAG
751 TGGCCACTCA ARAAATTACC CCATTCAGAT AGTAAATCAA ACCCCAGTAC
801 ACCAGGGAAT TTCTCCCAGA ACACTGAATG CTTGGGTAAA ATGTATAGAG
851 GAGAAGAAAT TCAGCCCAGA AATAGTGCCT ATGTTCATAG CTTTGTCAGA
901 AGGATGCCTC CCATACGACC TCAACGGCAT GCTCAATGCC ATTGGGGACC
951 ATCAGGGAGC TCTCCAAATA GTGAAAGATG TCATCAATGA CGAAGCTGCA
1001 GACTGGGATC TTAGACATCC TCAGATGGGG CCTATGCCCC AAGGGGTGCT
1051 AAGAAACCCA ACAGGGAGTG ACATAGCAGG AACCACCAGC AGCATAGAAG
1101 AACAAATTGA ATGGACAACT AGGCAGCAAG ATCAGGTAAA TGTAGGAGGA
1151 ATTTACAAAC AATGGATAGT TCTGGGATTG CAAAAATGTG TGAGCATGTA
1201 CAATCCAGTG AATATTCTAG ATATAAAACA GGGACCAAAA GAACCCTTTA
1251 AGGACTATGT GGATCGATTT TACAAAGCTC TGCGGGCGGA GCGAACAGAT
1301 CCACAAGTGA AAAACTGGAT GACGCAGACA TTGCTCATCG AGAATGCAAA
1351 CCCAGATTGT AAAGCCATTC TTAAGGGATT AGGCATGAAC CCCACCTTGG
1401 AAGAAATGTT ATTGGCATGT CAAGGAGTAG GGGGACCAAA GTATAAAGCT
1451 CAAATGATGG CAGAAGCAAT GCAGGAGGTG CAAGGAAAAA TTATGATGCA
1501 AGCCTCGGGA GGACCACCGC GGGGTCCCCC AAGGCAGCCA CCCAGAAATC
1551 CTAGATGCCC CAACTGTGGA AAGTTTGGAC ATGTACTGAG AGACTGTAGA
1601 GCCCCAAGAA AGCGAGGATG CTTCAAGTGT GGAGATCCAG GACATCTGAT
1651 GAGAAACTGC CCAAAGATGG TGAATTTTTT AGGGAATGCT CCYTGGGGCA
1701 GTGGCAAAGC GAGGAACTTT CCTGCCGTGC CACTGACCCC AACGGCACCC
1751 CCGATGCCAG GATTAGAGGA YCCAGCAGAG ARGATGCTRC TGGATTACAT
1801 GAAGAAGGGG CAACAGATGA AGGCAGAGAG GGAAGCCAAA CGGGAGAAGG
1851 ACAAAGGCCC TTACCAGGCG GCTTACAACT CCCTCAGTTC TCTCTTTGCA
1901 ACAGACCAAC TACAGTAGTA GAGATAGAGG GGCAAAAAGT GGAGGCCCTA
1951 CTAGATACAG GAGCAGATGA CACAGTAATC AAAGATTTAC AATTAACAGG
2001 CAATTGGAAA CCACAAATCA TAGGAGGAAT TGGAGGAGCA ATTAGGGTAA
2051 AGCAATATTT CAATTGTAAA ATAACAGTGG CAGGTAAAAG CACTCATGCT
2101 TCAGTACTAG TGGGCCCCAC TCCTGTAAAT ATTATAGGTA GAAATGTACT
2151 TAAAAAGTTA GGATGTACTT TGAACTTCCC TATTAGTAAR ATAGAAACAG
2201 TAAAGGTAAC ACTAAAACCA GGAACTGATG GACCAAGAAT CAAACAGTGG
2251 CCACTGTCTA AAGAAAAGAT TTTAGCCTTA CAAGAAATAT GCAATCAGAT
2301 GGAAAAAGAA GGCAAAATCT CTAGAATAGG TCCAGAAAAT CCTTACAACA
2351 CACCAGTGTT TTGTATAAAA AAGAAAGATG GAGCCAGCTG GAGAAAACTG
2401 GTAGATTTTA GACAATTGAA TAAAGTGACA CAGGATTTCT TTGAGGTGCA
2451 GCTAGGAATC CCACATCCTG GAGGTCTAAA ACAATGTGAA CAGATCACAG

[0074] The still missing region of the total sequence, from the integrase up to the 3′-LTR, was amplified with the following primer pairs, the primer 1270 being derived from the sequence of the 5′-LTR region (prior amplificate):

[0075] 1. PCR:

[0076] (Seq. ID No. 13 and 14)

[0077] 1246 CCT ATT CAT GGC CAG GTA
1270 GAT TTT TCT CTA CTC TCA CTA

[0078] 2. Nested PCR:

[0079] (Seq. ID No. 15 and 16)
1196 AGT GAA AAG GTA GCC ACA GTC
1270 GAT TTT TCT CTA CTC TCA CTA

[0080] The platinum-Taq DNA polymerase (Gibco) was in each case used according to the following temperature program:

[0081] 1) initial denaturation: 2 min. 95° C.,

[0082] 2) amplification: 30 sec. 95° C., 30 sec. (47° C. 1.PCR; 51° C. 2.PCR), 360 sec. 68° C. (30 cycles). 1× enhancer (Gibco) was used in addition to the buffers indicated above.
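The relationship between primer 1270 and the previously obtained 5′-LTR amplificate can be checked with a short sketch. It is illustrative only (the function name is an assumption): the reverse complement of primer 1270 is computed and searched for in the stretch of Table 5 that spans the end of its first row and the start of its second row.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


# Primer 1270 as listed above, with the spacing removed
primer_1270 = "GAT TTT TCT CTA CTC TCA CTA".replace(" ", "")

# Bases 31 to 70 of the sequence in Table 5 (5'-LTR region)
ltr_fragment = "CTTCTGATAT CTAGTGAGAG TAGAGAAAAA TCTCCAGGAG".replace(" ", "")

binding_site = reverse_complement(primer_1270)
print(binding_site, binding_site in ltr_fragment)
```

The reverse complement of primer 1270 occurs in the LTR sequence of Table 5. Because the LTR is present at both ends of the proviral genome, a primer taken from the 5′-LTR amplificate can serve as the antisense primer for the region reaching into the 3′-LTR.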

[0083] The sequence according to Table 6 was obtained here:

TABLE 6
1 AGTGAAAAGG TAGCCACAGT CTGTTGGTGG GCTCAAATAG AGCACACCAC
51 AGGTGTACCC TATAACCCCC AGAGTCAGGG AGTAGTGGAA GCAAAGAATC
101 ATCATCTTAA GACAATCATA GAACAAGTTA GGGATCAAGC AGAAAAATTA
151 GAAACAGCAG TACAAATGGC ASTATTAATA CACAATTTTA AAAGAAAAGG
201 GGGGATAGGG GAGTATAGTC CAGGAGAAAG AATAGTAGAT ATCATAACCA
251 CAGACATTCT AACAACTAAA TTACAACAAA ATATTTCAAA AATTCAAAAT
301 TTTCGGGTTT ATTACAGAGA AGGAAGGGAT CAACAGTGGA AAGGACCAGC
351 AGAACTCATT TGGAAAGGAG AAGGCGCTGT GGTGATTAAA GAAGGGACAG
401 ACTTAAAGGT GGTACCAAGA AGAAAAGCCA AAATCATCAG AGATTATGGA
451 AAAGCAGTGG ATAGTAATTC CCACATGGAG AGTAGAGAGG AATCAGCTTG
501 AGAAATGGAA TTCATTAGTA AAATATCATA AATATAGGGG AGAAAAATAC
551 CTAGAAAGAT GGGAACTATA CCACCATTTC CAATGCTCGG GGTGGTGGAC
601 ACACTCTAGA AAAGATGTTT ACTTTAAAGA TGGCTCAGTA ATAAGCATTA
651 CTGCCTTCTG GAATCTTACC CCAGAGAAAG GATGGTTGTC TCAATATGCA
701 GTTACAATAG AATATGTAAA AGAAAGCTAT TATACTTACA TAGACCCAGT
751 TACAGCAGAC AGAATGATTC ATTGGGAATA TTTCCCATGT TTTACAGCCC
801 AGGCTGTGAG AAAAGTACTG TTTGGAGAAA GACTAATAGC TTGCTACAGC
851 CCCTGGGGAC ACAAAGGACA GGTAGGGACT CTACAATTCC TGGCTTTGCA
901 AGCTTACCTT CAGTATTGTA AACATGGCAG AAAGAGCACC AGAAGTGCCG
951 GAAGGGGCAG GAGAGATACC TCTAGAACAG TGGCTAGAAA GATCATTAGA
1001 ACAACTCAAC AGAGAGGCCC GGTTACACTT CCACCCAGAG TTCCTTTTCC
1051 GTCTTTGGAA CACTTGTGTA GAACATTGGC ATGATAGACA CCAGAGGAGC
1101 CTGGAGTATG CAAAATACAG ATATCTTTTG TTGGTGCATA AGGCCATGTT
1151 TACCCATATG CAACAGGGAT GCCCATGTAG AAATGGGCAC CCAAGAGGAC
1201 CTCCTCCTCC AGGATTGGCC TAATTTCTGT CTTGCAGATG GAACAGCCAC
1251 CTGAGGACGA GGCTCCACAG AGAGAACCTT ATAATGAATG GCTGATAGAT
1301 ACCTTGGCAG AAATCCAGGA AGAAGCTTTG AAGCATTTTG ATAGGCGCTT
1351 GCTACATGCA GTAGGCTCAT GGGTGTATGA GCAACAGGGA GACACCTTAG
1401 AAGGTGTCCA AAAGCTAATA ACTATTCTAC AAAGAGCTTT GTTTTTGCAC
1451 TTCAGGCATG GATGCAGGGA AAGCCGCATT GGACAAGCAG GAGGGAAATA
1501 TAATTCCCTC AGATCCTTTC CAAGGCCAGA CAACCCCTTG TAATAAATGC
1551 TATTGTAAAA GATGTTGCTA TCACTGCCAG TTATGCTTCT TGCAGAAAGC
1601 CTTAGGGATA CATTATCATG TCTACAGAGT CAGGAGACCT CGACAGAGAT
1651 TTTTGGGCGA AGTACCACCA CATAGTGCAG CAACTGTGGA AAGGTAAGTA
1701 AAAAGTAAGT AGACATGCTT AGATATATAG TTTTAGGAAT AGTCATAGGA
1751 TTAGGGATAG GACACCAATG GGTTACAGTG TATTATGGAA CACCTAAATG
1801 GCACCCAGCT AGGACACATC TCTTTTGTGC AACAGATAAT AATTCCTTTT
1851 GGGTCACAAC AAGTTGTGTG CCCAGCCTAT TGCACTATGA AGAACAACAC
1901 ATTCCCAACA TAACAGAAAA CTTCACAGGC CCCATAACAG AGAATGAAGT
1951 AATAAGACAA GCATGGGGAG CTATCTCTTC CATGATAGAT GCAGTCTAA
2001 AACCCTGTCT AAAGCTGACA CCATATTGTG TCAAGATGAA ATGCACAAAG
2051 GGAGATACTG ATACTACAGA AAGGACAACA TCAACCACTC CCTCTTGGTC
2101 CACATCCACC CCAACCTCTA CCCCTATGAC TCCCAATACC ACTGGATTAG
2151 ATATAGACTC AAACAATACA GAACCCACAA CACAAGAGAA TCGGATATGT
2201 AAATTTAATA CTACAGGATT ATGTAGAGAC TGCAGATTGG AAATAGAAGA
2251 AAACTTCACA TATCAGGATA TAACATCTAG AAATAGTAGT GAAGATACTG
2301 AAGAGTGCTA TATGACACAT TGTAACTCAT CAGTAATAAC ACAGGATTGC
2351 AATAAGGCAT CAACAGATAA AATGACTTTT AGGTTGTCTG CACCACCAGG
2401 ATATGTCCTG TTGAGATGTA GAGAAAAGCT AAACCAAACC AAATTGTGTG
2451 GCAATATTAC AGCAGTGCAA TGCACTGACC CAATGCCTGC AACTATATCC
2501 ACTATGTTTG GATTTAATGG GACCAAACAT GACTATGATG AGCTAATTTT
2551 AACAAACCCT CAAAAGATAA ATGAGTTTCA TGATCACAAG TATGTATATA
2601 GAGTTGATAA AAAATGGAAG CTACAGGTAG TATGTAGAAG AAAAGGGAAT
2651 AGATCAACAA TATCAACGCC AAGTGCTACG GGCTTATTGT TCTATCATGG
2701 GCTACAACCA GGGAAAAATT TAAAAAAGGG GATGTGCCAG CTGAAGGGAT
2751 TATGGGGAAA GGCCATGCAC CAACTATCAG AGGAACTTAG AAAGATAAAT
2801 GGAAGTATTT ATAGAAAATG GAATGAGACA GCAGGCTGCA GAAAGCTAAA
2851 CAAACAGAAC GGTACAGGTT GCTCATTGAA AACAATAGAA GTTAGTGAGT
2901 ACACCACGGA GGGCGATCCG GGGGCAGACA CAATTATGCT TCTTTGTGGA
2951 GGTGAGTATT TCTTTTGTAA TTGGACAAAG ATTTGGAAGA CATGGAATAA
3001 CCAGACGTCA AATGTCTGGT ATCCTTGGAT GTCATGCAAT ATTAGACAAA
3051 TTGTAGATGA TTGGCATAAA GTAGGGAAAA AAATTTATAT GCCTCCTGCA
3101 AGTGGATTTA ACAAIGAGAT AAGGTGTACT AATGATGTCA CGGAAATGTT
3151 CTTTGAGGTT CAGAAGAAGG AAGAGAATAA ATATTTAATA AAGTTTATAC
3201 CTCAAGATGA GATACAAAAT CAGTATACAG CAGTAGGAGC ACATTATAAA
3251 TTGGTGAAAG TGGATCCTAT AGGGTTCGCA CCCACAGATG TGCATAGATA
3301 CCATCTACCA GATGTAAAGC AGAAGAGAGG AGCAGTCTTG CTTGGAATGC
3351 TCGGCCTCTT AGGTTTGGCA GGTTCCGCGA TGGGCTCAGT GGCGATAGCA
3401 CTGACGGTCC AGTCCCAGGC TTTATTGAAT GGGATTGTGG AGCAGCAGAA
3451 GGTTCTGCTG AGCCTGATAG ATCAGCACTC CGAGTTATTA AAACTAACTA
3501 TCTGGGGTGT AAAAAATCTT CAGGCCCGCC TCACAGCCTT GGAGGAATAC
3551 GTAGCGGACC AATCAAGACT GGCAGTATGG GGATGCTCAT TCTCTCAAGT
3601 ATGCCACACT AATGTAAAGT GGCCTAATGA TTCAATAGTT CCTAACTGGA
3651 CCTCGGAAAC ATGGCTTGAA TGGGATAAAA GAGTGACAGC AATTACAACA
3701 AATATGACAA TAGACTTGCA GAGGGCATAT GAATTCGAAC AAAAGAATAT
3751 GTTTGAGCTT CAAAAATTAG GAGATCTCAC CTCCTGGGCC AGCTGGTTCG
3801 ACCTCACGTG GTGGTTTAAA TATATTAAGA TAGGAATTCT TATAATAATA
3851 GTGATAATAG GACTTAGAAT ATTAGCTTGC TTATGGTCAG TATTAGGCAG
3901 GTTTAGGCAG GGTTACCGCC CTCTTCCTTA TGTCTTCAAG GCAGACTATC
3951 ACCGACCCCA CAACCTCAAA CAGCCAGACA AAGAAAGAGG AGAAGAGCAA
4001 GACAGAGAAA AACAGAACAT CAGCTCAGAG AATTACAGGC CAGGATCTGG
4051 CAGAGCTTGG AGCAAAGAGC AAGTAGAGAC CTGGTGGAAG GAGTCCAGGC
4101 TCTACATTTG GTTGAAGAGC ACACAAGCAG TAATTGAATA TGGGTGGCAA
4151 GAGCTCAAAG CAGCAGGAGC AGAAATATAT AAAATATTAC AGAGCGCTGC
4201 GCAGAGGCTA TGGAGCGGAG GGCACCAACT CGGACTATCA TGTATTAGAG
4251 CAGCTACAGC CTTTGGCAGA GGAGTCAGAA ACATTCCTAG ACGCATCAGA
4301 CAAGGAGCAG AAGTCTTACT CAACTGAGTT AGACTTAAGA CATCAACAAG
4351 ATGTAAGCCT CCCCACAGAA GAAGAACAGC CTTGGGAAGA GGAAGAGGAG
4401 GTAGGCTTTC CAGTCTACCC ACGACAGCCT GTGCATGAAG CCACCTATAA
4451 AGACTTGATA GACCTGTCCC ACTTTTTAAA AGAAAAGGGG GGACTGGAAG
4501 GGATTTGGTG GTCTAAAAGA AGAGAAGAAA TCTTGGATAT ATATGCACAA
4551 AATGAATGGG GAATTATACC TGACTGGCAG GCTTACACTT CAGGACCGGG
4601 GATCAGGTAT CCAAAAGCAT TTGGGTTCCT GTTTAAACTG ATCCCAGTGG
4651 CAGTTCCACC GGAACAAGAG AACAATGAAT GCAATAGGCT GCTAAACTCT
4701 TCTCAGACAG GAATCCAGGA AGATCCATGG GGAGAAAGGC TCATGTGGAA
4751 GTTTGACTCT GCTCTTGCCT ATACTTTCTA TGCTCCCATA AAGAGGCCAG
4801 GAGACTTCAA GCATGTCCAA AGTCTTAGCT ATGAAGCTTA TAAGAAGGAA
4851 CCTGACTGCT GCAAGAGGAA GTGGTGGCGC TTCTAGCCGA CCACAGAGGG
4901 TTGCTATGGC GATACCCTTT AAAACTGCTA ACTCTGGAGG GACTTTCCAC
4951 TAGTGCATGC GCACTGGACT GGGGACTTTC CAGGATGACG CCGGGTGGGG
5001 GAGTGGTCAG CCCAATCTGG CTGCATATAA GCAGCTCGCT TTGCGCTTGT
5051 ATTGAGTCTC TCCCTGAGAG CCTACCAGAT TGAGCCTAGG TTGTTCTCTG
5101 GTGAGTCCTT GAAGGAGTGC CTGCTTGTAG CCCTGGGCGG TTCGCAGGCC
5151 CCTGGCTTGT AGCTCTGGGT AGCTCGTCAG GTGTTCTGGA AAGGTCTTGC
5201 TAAGGGGACG CCTTTGCTTG GTCTTGGTAG ACCTCTAGCA GTCTCAGTGG
5251 CCAGGAGGCT GTGGGATTCA CTACCGCTTG CTTGCCTTTG ATGCTCAATA
5301 AAGCTTACCC GAATTAGAAA GGCATTCAAG TGTACTCGCT CATTTTGTCT
5351 TTGGTAGAAA CTCTGGTTAC TGGAGATCCC TCAGATTTGT GCCAGAGTTC
5401 TGATATCTAG TGAGAGTAGA GAAAAATC

[0084] The total sequence which results from the sum of the sequences according to Tables 2 to 6 is shown in Table 7: TABLE 7 1 TRDCTAGAGA TCCCTCAGAT TTGTGCCAGA CTTCTGATA CTAGTGAGAG 51 TAGAGAAAAA TCTCCAGCAG TGGCGCCCGA ACAGGGACTT GACGAAGAGC 101 CAAGTCATTC CCACCTGTGA GGGACAGCGG CGGCAGCCGG CCGGACCGAC 151 CCACCCGGTG AAGTGAGTTA ACCAAGGAGC CCCGACGCGC AGGACACAAG 201 GTAAGCGGTG CACCGTGCTG TAGTGAGTGT GTGTCCAGGA TCCGCTTGAG 251 CAGGCGAGAT CGCCGAGGCA ACCCCAGTAG AAAAAGAAAA GAGGGGAAGT 301 AAGGCCGAGG CAAAGTGAAA GTAAAAGAGA TCCTCTGAGA AGAGGAACAG 351 GGGGCAATAA AATTGGCGCG AGCGCGTCAG GACTTAGGGG AAGAGAATTG 401 GATGAGCTGG AAAAGATTAG GTTACGGCCC TCCGGAAAGA AAAAATACCA 451 GCTAAAACAT GTGATATGGG TAAGCAAGGA ACTAGATAGA TTTGGCCTAC 501 ATGAAAAGTT GTTAGAAACC AAGGAAGGAT GCGAAAAAAT TCTTAGCGTA 551 CTCTTTCCTC TAGTTCCTAC AGGGTCAGAA AATTTAATTT CGCTGTACAA 601 CACCTGCTGT TGCATTTGGT GCGTACATGC GAAACTGAAA GTAGCAGATA 651 CAGAAGAGGC AAAAGAGAAA GTAAAACAAT GCTACCATCT AGTGGTTGAA 701 AAACAGAATC CAGCCTCAGA AAAAGAAAAA GGAGCAACAG TGACACCTAG 751 TGGCCACTCA AGAAATTACC CCATTCAGAT AGTAAATCAA ACCCCAGTAC 801 ACCAGGGAAT TTCTCCCAGA ACACTGAATG CTTGGGTAAA ATGTATAGAG 851 GAGAAGAAAT TCAGCCCAGA AATAGTGCCT ATGTTCATAG CTTTGTCAGA 901 AGGATGCCTC CCATACGACC TCAACGGCAT GCTCAATGCC ATTGGGGACC 951 ATCAGGGAGC TCTCCAAATA GTGAAAGATG TCATCAATGA CGAAGCTGCA 1001 GACTGGGATC TTAGACATCC TCAGATGGGG CCTATGCCCC AAGGGGTGCT 1051 AAGAAACCCA ACAGGGAGTG ACATAGCAGG AACCACCAGC AGCATAGAAG 1101 AACAAATTGA ATGGACAACT AGGCAGCAAG ATCAGGTAAA TGTAGGAGGA 1151 ATTTACAAAC AATGGATAGT TCTGGGATTG CAAAAATGTG TGACCATGTA 1201 CAATCCAGTG AATATTCTAG ATATAAAACA GGGACCAAAA GAACCCTTTA 1251 AGGACTATGT GGATCGATTT TACAAAGCTC TGCGGGCGGA GCGAACAGAT 1301 CCACAAGTGA AAAACTGGAT GACGCAGACA TTGCTCATCC AGAATGCAAA 1351 CCCAGATTGT AAAGCCATTC TTAAGGGATT AGGCATGAAC CCCACCTTGG 1401 AAGAAATGTT ATTGGCATGT CAAGGAGTAG GGGGACCAAA GTATAAAGCT 1451 CAAATGATGG CAGAAGCAAT GCAGGAGGTG CAAGGAAAAA TTATGATGCA 1501 AGCCTCGGGA GGACCACCGC GGGGTCCCCC AAGGCAGCCA CCCAGAAATC 1551 CTAGATGCCC CAACTCTGGA AAGTTTGGAC 
ATGTACTGAG AGACTGTAGA 1601 GCCCCAAGAA AGCGAGGATG CTTCAAGTGT GGAGATCCAG GACATGTGAT 1651 GAGAAACTGC CCAAAGATGG TGAATTTTTT AGGGAATGCT CCCTGGGGCA 1701 GTGGCAAACC CAGGAACTTT CCTGCCGTGC CACTGACCCC AACGGCACCC 1751 CCGATGCCAG GATTAGAGGA CCCAGCAGAG AAGATGCTAC TOGATTACAT 1801 GAAGAAGGGG CAACAGATGA AGGCAGAGAG GGAAGCCAAA CGGGAGAAGG 1851 ACAAAGGCCC TTACGAGGCG GCTTACAACT CCCTCAGTTC TCTCTTTGGA 1901 ACAGACCAAC TACAGTAGTA GAGATAGAGG GGCAAAAAGT GGAGGCCCTA 1951 CTAGATACAG GAGCAGATGA CACAGTAATC AAAGATTTAC AATTAACAGG 2001 CAATTGGAAA CCACAAATCA TAGGAGGAAT TGGAGGAGCA ATTAGGGTAA 2051 AGCAATATTT CAATTGTAAA ATAACAGGG CAGGTAAAAG CACTCATGCT 2101 TCACTACTAG TGGGCCCCAC TCCTCTAAAT ATTATAGGTA GAAATGTAGT 2151 TAAAAAGTTA GGATGTACTT TGAACTTTCC TATTAGTAAG ATAGAAACAG 2201 TAAAGGTAAC ACTAAAACCA GGAACTGATG GACCAAGAAT CAAACAGTGG 2251 CCACTGTCTA AAGAAAAGAT TTTAGCCTTA CAAGAAATAT GCAATCAGAT 2301 GGAAAAAGAA GGCAAAATCT CTAGAATAGG TCCAGAAAAT CCTTACAACA 2351 CACCAGTGTT TTGTATAAAA AAGAAAGATG GAGCCAGCTG GAGAAAACTG 2401 GTAGATTTTA GACAATTGAA TAAAGTGACA CAGGATTTCT TTGAGGTGCA 2451 GCTAGGAATC CCACATCCTG GAGGTCTAAA ACAATGTGAA CAGATCACAG 2501 TATTGCATAT AGGAGATGCC TATTTTTCAT GCCCATTGGA TGAGGACTTT 2551 AGAAAGTATA CTGCATTCAC CATTCCATCG GTGAATAATC AGGGCCCAGG 2601 AATCAGATAC CAGTATAATG TCCTCCCACA GGGATGGAAA GGCTCTCCAG 2651 CAATTTTTCA GGCAACAGCT GATAAAATCT TGAAAACATT CAAAGAAGAA 2701 TACCCAGAGG TATTAATTTA TCAGTATATG GATGATCTGT TCGTCCGAAG 2751 TGACTTAAAT GCCACTGAAC ATAACAAAAT GATAAACAAG TTGAGAGAGC 2801 ATCTGAGATT CTGGGGGCTC GAGACCCCAG ATAAGAAGTT TCAAAAGGAA 2851 CCTCCTTTTG AATGGATGGG ATATGTGCTA CACCCAAAGA AATGGACAGT 2901 GCAGAAAATA CAACTACCAG AAAAAGAGCA ATGGACAGTG AATGATATTC 2951 AGAAATTGGT AGGAAAACTT AATTGGGCAA GTCAGATATA TTCCGGAATT 3001 AAAACAAAAG AGCTCTGTAA ATTGATCAGA GGAGCAAAAC CTCTAGATGA 3051 AATAGTAGAA TGGACAAGAG AAGCAGAATT AGAGTATGAA GAGAATAAGA 3101 TAATAGTGCA GGAGGAGGTG CATGGAGTGT ACTATCAGCC AGAAAAACCA 3151 CTGATGGCAA AAGTACAAAA GTTGACACAA GGACAGTGGA GTTATCAAAT 3201 AGAGCAAGAA GAAAACAAAC CTCTCAAGGC AGGAAAATAT 
GCCAGGACAA 3251 ACAATGCCCA CACAAATGAG TTAAGGACAC TTGCAGCGTT ACTACAAAAA 3301 ATAGCCAAGG AATGCATAGT AATCTCGGCA AGATTGCCAA AATTTTACCT 3351 CCCCTTGGAG AGAGAAGTAT GGGATCAATG GTGGCATGAT TATTGGCAGG 3401 TAACATGGAT CCCAGAGTGG GAATTCATCT CAACACCACC ATTGATAAGG 3451 CTATGGTACA ACCTCCTGAA AGAACCAATT CCAGGAGAAG ATGTATACTA 3501 TGTAGATGGG GCAGCTAACA GAAATTCTAA AGAAGGCAAG GCACGATACT 3551 ATACAGCAAG GGGCAAAAGT AAGGTAATAC CTTTAGAAAA TACAACCAAT 3601 CACAAGGCAC AGCTGAAGGC AATAGAATTA GCCCTAAAAG ATTCAGGACC 3651 AAGAGTAAAC ATAGTAACAC ATTCCCAGTA TGCATTAGGC ATACTCACAG 3701 CATCCCCACA TCAGTCAGAT AACCCCATAG TTAGGGAAAT AATTAACCTC 3751 ATGATAGCCA AGGAAGCAGT CTACCTGTCA TGGGTACCAG CCCACAAGGG 3801 TATAGGAGGT AACGAACAAA TAGACAAATT AGTAAGCCAA GGAATTAGGC 3851 AAGTACTATT CCTGGAAGGA ATAGACAGAG CTCAGGAACA ACACGACAAA 3901 TATCATAACA ACTGGAGAGC TTTAGCTCAG CAATTCAGCA TACCTCCTAT 3951 AGTGGCAAAA GAGATAGTTG CACAATGCCC AAAATGCCAG ATAAAAGGGG 4001 AACCTATTCA TGGCCAGGTA GATGCAAGTC CTGGGACATG GCAAATGGAT 4051 TGCACCCATC TAGAAGGAAA GGTCATCATA GTGGCAGTCC ATGTAGCCAG 4101 TGGATACCTA GAGGCAGAAG TAATACCAGC AGAGACAGGA AAAGAGACAG 4151 CACATTTCCT GTTAAAGTTA GCAGGCAGGT GGCCTGTAAA ACATTTACAC 4201 ACTGACAATG GCCCCAACTT TGTCAGTGAA AAGGTAGCCA CAGTCTGTTG 4251 GTGGGCTCAA ATAGAGCACA CCACAGGTGT ACCCTATAAC CCCCAGAGTC 4301 AGGGAGTAGT GGAAGCAAAG AATCATCATC TTAAGACAAT CATAGAACAA 4351 GTTAGGGATC AAGCAGAAAA ATTAGAAACA GCAGTACAAA TGGCAGTATT 4401 AATACACAAT TTTAAAAGAA AAGGGGGGAT AGGGGAGTAT AGTCCAGGAG 4451 AAAGAATAGT AGATATCATA ACCACAGACA TTCTAACAAC TAAATTACAA 4501 CAAAATATTT CAAAAATTCA AAATTTTCGG GTTTATTACA GAGAAGGAAG 4551 GGATCAACAG TGGAAAGGAC CAGCAGAACT CATTTGGAAA GGAGAAGGCG 4601 CTGTGGTGAT TAAAGAAGGG ACACACTTAA AGGTGGTACC AAGAAGAAAA 4651 GCCAAAATCA TCAGACATTA TGGAAAAGCA GTGGATAGTA ATTCCCACAT 4701 GGAGAGTAGA GAGGAATCAG CTTGAGAAAT GCAATTCATT AGTAAAATAT 4751 CATAAATATA GGGGAGAAAA ATACCTAGAA AGATGGGAAC TATACCACCA 4801 TTTCCAATGC TCGGGGTGGT GGACACACTC TAGAAAAGAT GTTTACTTTA 4851 AAGATGGCTC AGTAATAAGC ATTACTCCCT TCTGGAATCT TACCCCAGAG 4901 
AAAGGATGGT TGTCTCAATA TGCAGTTACA ATAGAATATG TAAAAGAAAG 4951 CTATTATACT TACATAGACC CAGTTACAGC AGACAGAATG ATTCATTGGG 5001 AATATTTCCC ATGTTTTACA GCCCAGGCTG TGAGAAAAGT ACTGTTTGGA 5051 GAAAGACTAA TAGCTTGCTA CAGCCCCTGG GGACACAAAG GACAGGTAGG 5101 GACTCTACAA TTCCTGGCTT TGCAAGCTTA CCTTCAGTAT TGTAAACATG 5151 GCAGAAAGAG CACCAGAAGT GCCGGAAGGG GCAGGAGAGA TACCTCTAGA 5201 ACAGTGGCTA GAAAGATCAT TAGAACAACT CAACAGAGAG GCCCGGTTAC 5251 ACTTCCACCC AGAGTTCCTT TTCCGTCTTT GGAACACTTG TGTAGAACAT 5301 TGGCATGATA GACACCAGAG GAGCCTGGAG TATGCAAAAT ACACATATCT 5351 TTTGTTCGTG CATAAGGCCA TGTTTACCCA TATCCAACAG GGATGCCCAT 5401 GTAGAAATGG GCACCCAAGA GGACCTCCTC CTCCAGGATT GGCCTAATTT 5451 CTGTCTTGCA GATGGAACAG CCACCTGAGG ACGAGGCTCC ACAGAGAGAA 5501 CCTTATAATG AATGGCTGAT AGATACCTTG GCAGAAATCC AGGAAGAAGC 5551 TTTGAAGCAT TTTGATAGGC GCTTGCTACA TGCAGTAGGC TCATGGGTGT 5601 ATGAGCAACA GGGAGACACC TTAGAAGGTG TCCAAAAAGT AATAACTATT 5651 CTACAAAGAG CTTTGTTTTT GCACTTCAGG CATGGATGCA GGGAAAGCCG 5701 CATTGCACAA GCAGGAGGGA AATATAATTC CCTCAGATCC TTTCCAAGGC 5731 CAGACAACCC CTTGTAATAA ATGCTATTGT AAAAGATGTT GCTATCACTG 5801 CCAGTTATGC TTCTTGCAGA AAGCCTTAGG GATAGATTAT GATGTCTACA 5851 GAGTCAGGAG ACCTCGACAG AGATTTTTGG GCGAAGTACC ACCACATAGT 5901 GCAGCAACTG TGGAAAGGTA AGTAAAAAGT AAGTAGACAT GCTTAGATAT 5951 ATAGTTTTAG GAATAGTCAT AGGATTAGGG ATAGGACACC AATGGGTTAC 6001 AGTGTATTAT GGAACACCTA AATGGCACCC AGCTAGGACA CATCTCTTTT 6051 GTGCAACAGA TAATAATTCC TTTTGGGTCA CAACAAGTTG TGTGCCCAGC 6101 CTATTGCACT ATGAAGAACA ACACATTCCC AACATAACAG AAAACTTCAC 6151 AGGCCCCATA ACAGAGAATG AAGTAATAAG ACAAGCATGG GGAGCTATCT 6201 CTTCCATGAT AGATGCAGTC TTAAAACCCT GTGTAAAGCT GACACCATAT 6251 TGTGTCAAGA TGAAATGCAC AAAGGGAGAT ACTGATACTA CAGAAAGGAC 6301 AACATCCACC ACTTCCTCTT GGTCCACATC CACCCCAACC TCTACCCCTA 6351 TGACTCCCAA TACCACTGGA TTAGATATAG ACTCAAACAA TACAGAACCC 6401 ACAACACAAG AGAATCGGAT ATGTAAATTT AATACTACAG GATTATGTAG 6451 AGACTGCAGA TTGGAAATAG AAGAAAACTT CAGATATCAG GATATAACAT 6501 GTAGAAATAG TAGTGAAGAT ACTGAAGAGT GCTATATGAC ACATTGTAAC 6551 TCATCAGTAA 
TAACACAGCA TTGCAATAAG GCATCAACAG ATAAAATGAC 6601 TTTTAGGTTG TGTGCACCAC CAGGATATGT CCTGTTGAGA TGTAGAGAAA 6651 AGCTAAACCA AACCAAATTG TGTCGCAATA TTACAGCAGT GCAATGCACT 6701 GACCCAATGC CTGCAACTAT ATCCACTATG TTTGGATTTA ATGGGACCAA 6751 ACATGACTAT GATGAGCTAA TTTTAACAAA CCCTCAAAAG ATAAATGAGT 6801 TTCATGATCA CAAGTATGTA TATAGAGTTG ATAAAAAATG GAAGCTACAG 6851 GTAGTATGTA GAAGAAAAGG GAATAGATCA ATAATATCAA CGCCAAGTGC 6901 TACGGGCTTA TTGTTCTATC ATGGGCTAGA ACCAGGGAAA AATTTAAAAA 6951 AGGGGATGTG CCAGCTGAAG GGATTATGGG GAAAGGCCAT GCACCAACTA 7001 TCAGAGGAAC TTAGAAAGAT AAATGGAAGT ATTTATAGAA AATGGAATGA 7051 GACAGCAGGC TGCAGAAAGC TAAACAAACA GAACGGTACA GGTTGCTCAT 7101 TGAAAACAAT AGAAGTTAGT GACTACACCA CGGAGGGCGA TCCGGGGGCA 7151 GAGACAATTA TGCTTCTTTG TGGAGGTGAG TATTTCTTTT GTAATTGGAC 7201 AAAGATTTGG AAGACATGGA ATAACCAGAC GTCAAATGTC TGGTATCCTT 7251 GGATGTCATG CAATATTAGA CAAATTGTAG ATGATTGGCA TAAAGTAGGG 7301 AAAAAAATTT ATATGCCTCC TGCAAGTGGA TTTAACAATG AGATAAGGTG 7351 TACTAATGAT GTCACGGAAA TGTTCTTTGA GGTTCAGAAG AAGGAAGAGA 7401 ATAAATATTT AATAAAGTTT ATACCTCAAG ATGAGATACA AAATCAGTAT 7451 ACAGCAGTAG GAGCACATTA TAAATTGGTG AAAGTGGATC CTATAGGGTT 7501 CGCACCCACA GATGTGCATA GATACCATCT ACCAGATGTA AAGCAGAAGA 7551 GAGGAGCAGT CTTGCTTGGA ATGCTCGGCC TCTTAGGTTT GGCAGGTTCC 7601 GCGATGGGCT CAGTGGCGAT AGCACTGACG GTCCAGTCCC AGGCTTTATT 7651 GAATGGGATT GTGGAGCAGC AGAAGGTTCT GCTGAGCCTG ATAGATCAGC 7701 ACTCCGAGTT ATTAAAACTA ACTATCTGGG GTGTAAAAAA TCTTCAGGCC 7751 CGCCTCACAG CCTTGGAGGA ATACGTAGCG GACCAATCAA GACTGGCAGT 7801 ATGGGGATGC TCATTCTCTC AAGTATGCCA CACTAATGTA AAGTGGCCTA 7851 ATGATTCAAT AGTTCCTAAC TGGACCTCGG AAACATGGCT TGAATGGGAT 7901 AAAAGAGTGA CAGCAATTAC AACAAATATG ACAATAGACT TGCAGAGGGC 7951 ATATGAATTG GAACAAAAGA ATATGTTTGA GCTTCAAAAA TTAGGAGATC 8001 TCACCTCCTG CGCCAGCTGG TTCGACCTCA CGTGGTGGTT TAAATATATT 8051 AAGATAGGAA TTCTTATAAT AATAGTGATA ATAGGACTTA GAATATTAGC 8101 TTGCTTATGG TCAGTATTAG GcAGGTTTAG GCAGGGTTAC CGCCCTCTTC 8151 CTTATGTCTT CAAGGGAGAC TATCACCGAC CCCACAACCT CAAACAGCCA 8201 GACAAAGAAA GAGGAGAAGA 
GCAAGACACA GAAAAACAGA ACATCAGCTC 8251 AGAGAATTAC AGGCCAGGAT CTGGCAGAGC TTGGAGCAAA GAGCAAGTAG 8301 AGACCTGGTG GAAGGAGTCC AGGCTCTACA TTTGGTTGAA GAGCACACAA 8351 GCAGTAATTG AATATGGGTG GCAAGAGCTC AAAGCAGCAG GAGCAGAAAT 8401 ATATAAAATA TTACAGAGCG CTGCGCAGAG GCTATGGAGC GGAGGGCACC 8451 AACTCGGACT ATCATGTATT AGAGCAGCTA CAGCCTTTGG CAGAGGAGTC 8501 AGAAACATTC CTAGACGCAT CAGACAAGGA GCAGAAGTCT TACTCAACTG 8551 AGTTAGACTT AAGACATCAA CAAGATGTAA GCCTCCCCAC AGAAGAAGAA 8601 CAGCCTTGGG AAGAGGAAGA GGAGGTAGGC TTTCCAGTCT ACCCACGACA 8651 GCCTGTGCAT GAAGCCACCT ATAAAGACTT GATAGACCTG TGCCACTTTT 8701 TAAAAGAAAA GGGGGGACTG GAAGGGATTT GGTGGTCTAA AAGAAGAGAA 8751 GAAATCTTGG ATATATATGC ACAAAATGAA TGGCGAATTA TACCTGACTG 8801 GCAGGCTTAC ACTTCAGGAC CGGGGATCAG GTATCCAAAA GCATTTGGGT 8851 TCCTGTTTAA ACTGATCCCA GTGGCAGTTC CACCGGAACA AGAGAACAAT 8901 GAATGCAATA GGCTGCTAAA CTCTTCTCAG ACAGGAATCC AGGAAGATCC 8951 ATGGGGAGAA AGGCTCATGT GGAAGTTTGA CTCTGCTCTT GCCTATACTT 9001 TCTATGCTCC CATAAAGAGG CCAGGAGAGT TCAAGCATGT CCAAAGTCTT 9051 AGCTATGAAG CTTATAAGAA GGAACCTGAC TGCTGCAAGA GGAAGTGGTG 9101 GCGCTTCTAG CCGACCACAG AGGGTTGCTA TGGCCATACC CTTTAAAACT 9151 GCTAACTCTG GAGGGACTTT CCACTAGTGC ATGCGCACTG CACTGGGGAC 9201 TTTCCAGGAT GACGCCGGGT GGGGGAGTGG TCAGCCCAAT CTGGCTGCAT 9251 ATAAGCACCT CGCTTTGCGC TTGTATTGAG TCTCTCCCTG AGAGGCTACC 9301 AGATTGAGCC TAGGTTGTTC TCTGGTGAGT CCTTGAAGGA GTGCCTGCTT 9351 GTAGCCCTGG GCGGTTCGCA GGCCCCTGGC TTGTACCTCT GGGTAGCTCG 9401 TCAGGTGTTC TCCAAAGGTC TTGCTAAGGG GACGCCTTTG CTTGGTCTTG 9451 GTAGACCTCT AGCAGTCTCA GTGGCCAGGA GGCTGTGGGA TTGACTACCG 9501 CTTGCTTCCC TTTGATGCTC AATAAAGCTT ACCCOAATTA GAAAGGCATT 9551 CAAGTGTACT CGCTCATTTT GTCTTTGGTA GAAACTCTGG TTACTGGAGA 9601 TCCCTCAGAT TTGTGCCAGA CTTCTGATAT CTAGTGAGAG T
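The way partial sequences can be combined into one total sequence may be sketched as follows. This is a minimal illustration only: it assumes that neighbouring fragments share an exactly matching overlap, and the function names `merge_overlapping` and `assemble` and the `min_overlap` parameter are hypothetical. The text above does not state which procedure was actually used to combine Tables 2 to 6, and real fragments containing ambiguity codes or sequencing differences would need a tolerant comparison.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two fragments on the longest suffix of `left` that is a prefix of `right`."""
    left, right = left.upper(), right.upper()
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of at least %d bases found" % min_overlap)


def assemble(fragments, min_overlap: int = 10) -> str:
    """Merge an ordered list of fragments (5' to 3') into one contig."""
    contig = fragments[0].upper()
    for fragment in fragments[1:]:
        contig = merge_overlapping(contig, fragment, min_overlap)
    return contig


if __name__ == "__main__":
    # Toy fragments, not taken from the tables above.
    print(assemble(["ACGTACGTTTGACCA", "TTGACCAGGGTTT"], min_overlap=5))
```

The fragments must be supplied in genomic order; the sketch does not attempt to find that order itself.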

[0085] In 3 reading frames, the nucleotide sequence was converted into amino acid sequences, after which the amino acid sequences of GAG (Table 8), POL (Table 9) and ENV (Table 10) were identified by homology comparisons. TABLE 8 GAG: 1 IGASASGLRG RELDELEKIR LRPSGKKKYQ LKHVIWVSKE LDRFGLHEKL 51 LETKEGCEKI LSVLFPLVPT GSENLISLYN TCCCIWCVHA KVKVADTEEA 101 KEKVKQCYHL VVEKQNAASE KEKGATVTPS GHSRNYPIQI VNQTPVHQGI 151 SPRTLNAWVK CIEEKKESPE IVPMFIALSE GCLPYDLNGM LNAIGDHQGA 201 LQIVKDVIND EAADWDLRHP QMGPMPQGVL RNPTGSDIAG TTSSIEEQIE 251 WTTRQQDQVN VGGIYKQWIV LGLQKCVSMY NPVNILDIKQ GPKEPFKDYV 301 DRFYKALRAE RTDPQVKNWM TQTLLIQNAN PDCKAILKGL GMNPTLEEML 351 LACQGVGGPK YKAQMMAEAM QEVQGKIMMQ ASGGPPRGPP RQPPRNPRCP 401 NCGKFGHVLR DCRAPRKRGC FKCGDPGHLM RNCPKMVNFL GNAPWGSGKP 451 RNFPAVPLTP TAPPMPGLED PAEKMLLDYM KKGQQMKAER EAYREKDKGP 501 YEAAYNSLSS LFGTDQLQ
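The conversion of a nucleotide sequence into amino acid sequences in three reading frames can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the names `CODON_TABLE`, `translate` and `three_frame_translation` are hypothetical, and codons containing ambiguity codes are simply rendered as `X`. The subsequent identification of GAG, POL and ENV by homology comparison is not covered.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate `dna` starting at offset `frame` (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    # First 50 nucleotides of Table 6.
    start_of_table_6 = "AGTGAAAAGG TAGCCACAGT CTGTTGGTGG GCTCAAATAG AGCACACCAC"
    for frame, protein in enumerate(three_frame_translation(start_of_table_6)):
        print(frame, protein)
```

For the opening of Table 6, frame 0 yields SEKVATVCWWAQIEHT, which corresponds to the POL residues listed from position 851 in Table 9.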

[0086] TABLE 9 POL: 1 FFRECSLGQW QTQELSCRAT DPNGTPDARI RGPSREDATG LHEBGATDEG 51 REGSQTGEGQ RPLRGGLQLP QFSLWNRPTT VVEIEGQKVE ALLDTGADDT 101 VIKDLQLTGN WKPQIIGGIG GAIRVKQYFN CKITVAGKST HASVLVGPTP 151 VNIIGRNVLK KLGCTLNEPI SKIETVKVTL KPGTDGPRIK QWPLSKEKIL 201 ALQEICNQME KEGKISRIGP ENPYNTPVFC IKKKDGASWR KLVDFRQLNK 251 VTQDFFEVQL GIPHPGGLKQ CEQITVLDIG DAYFSCPLOE DFRKYTAFTI 301 PSVNNQGPGI RYQYNVLPQG WKGSPAIFQA TADKILKTFK EEYPEVLIYQ 351 YMDDLFVGSD LNATEHNKMI NKLREHLRFW GLETPDKKFQ KEPPFEWMGY 401 VLHPKKWTVQ KIQLPEKEQW TVNDIQKLVG KLNWASQIYS GIKTKELCKL 451 IRGAKPLDEI VEWTREAELE YEENKIIVQE EVHGVYYQPE KPLMAKVQKL 501 TQGQWSYQIE QEENKPLKAG KYARTKNAHT NELRTLAGLV QKIAKECLVT 551 WGRLPKFYLP LEREVWDQWW HDYWQVTWIP EWEFISTPPL IRLWYNLLKE 601 PIPGEDVYYV DGAANRNSKE GKAGYYTARG KSKVIALENT TNQKAELKAI 651 ELALKDSGPR VNIVTDSQYA LGILTASPDQ SDNPIVREII NLMIAKEAVY 701 LSWVPAHKGI GGNEQIDKLV SQGIRQVLFL EGIDRAQEEH DKYHNNWRAL 751 AQEFSIPPIV AKEIVAQCPK CQIKGEPIHG QVDASPGTWQ MDCTHLEGKV 801 IIVAVHVASG YLEAEVIPAE TGKETAHFLL KLAGRWPVKH LHTDNGPNFV 851 SEKVATVCWW AQIEHTTGVP YNPQSQGVVE AKNHHLKTII EQVRDQAEKL 901 ETAVQMAVLI HNFKRKGGIG EYSPGERIVD IITTDILTTK LQQNISKIQN 951 ERVYYREGRD QQWKGPAELI WKGEGAVVIK EGTDLKVVPR RKAKIIRDYG 1001 KAVDSNSHME SREESA*

[0087] TABLE 10 ENV: 1 QWVTVYYGTP KWHPARTHLF CATDNNSFWV TTSCVPSLLH YEEQHIPNIT 51 ENFTGPITEN EVIRQAWGAI SSMIDAVLKR CVKLTPYCVK MKCTKGDTDT 101 TERTTSTTSS WSTSTPTSTP MTPNTTGLDI DSNNTEPTTQ ENRTCKFNTT 151 GLCRDCRLEI EENFRYQDIT CRNSSEDTEE CYMTHCNSSV TTQDCNKAST 201 DKMTFRLCAP PGYVLLRCRE KLNQTKLCGN ITAVQCTDPM PATISTMFGF 251 NGTKHDYDEL ILTNPQKINE FHDHKYVYRV DKKWKLQVVC RRKGNRSIIS 301 TPSATGLLFY HGLEPGKNLK KGMCQLKGLW GKAMHQLSEE LRKINGSIYR 351 KWNETAGCRK LNKQNGTGCS LKTTEVSEYT TEGDEGAETI MLLCGGEYFF 401 CNWTKIWKTW NNQTSNVWYP WMSCNIRQIV DDWHKVGKKI YMPPASGFNN 451 EIRCTNDVTE MFFEVQKKEE NKYLIKFIPQ DEIQNQYTAV GAHYKLVKVD 501 PTGFAPTDVH RYHLPDVKQK RGAVLLGMLG LLGLAGSAMG SVAIALTVQS 551 QALLNGIVEQ QKVLLSLIDQ NDSIVPNWTS GVKNLQARLT ALEEYVADQS 601 RLAVWGCSFS QVCHTNVKWP LTSWASWFDL ETWLEWDKRV TAITTNMTID 651 LQRAYELEQK NMFELQKLGD LTSWASWEDL TWWFKYIKIG ILITIVIIGL 701 RILACLWSVL GRFRQGYRPL PYVFKGKYHR PHNLKQPDKE RGEEQDREKQ 751 NISSENYRPG SGRAWSKEQV ETWWKESRLY IWLKSTQAVI EYGWQELKAA 801 GAEIYKILQS AAQRLWSGGH QLGLSCIRAA TAFGRGVRNI PRRIRQGAEV 851 LLN*

EXAMPLE 4

[0088] Determination of the phylogenetic position of SIM27

[0089] Selection of the sequences:

[0090] From the HIV WWW server of the LANL (Los Alamos National Laboratory, hiv-web.lanl.gov), 31 HIV and SIV sequences were selected, comprising all complete SIV genomes and representatives of the various HIV-1 and HIV-2 subtypes. The sequences listed in Table 11 were taken into consideration.

TABLE 11
Genbank Accession No.:    Name:
AF075269                  SIV-l'hoesti
AF077017                  SIV-SMM, PGM
L06042                    SIV-SYKES
M27470                    SIV-Mandrill, MNDGB1
L40990                    SIV-VERVET, REV
M29975                    SIV-VERVET, AGM155
M30931                    SIV-VERVET, AGM3
X07805                    SIV-VERVET, AGMTY6
M66437                    SIV-GRIVET, AGM677A
U04005                    SIV-SABAEUS, AGMSAB1
AF103818                  SIV-CPZ-US
U42720                    SIV-CPZ, CPZANT
X52154                    SIV-CPZ, CPZGAB
U58991                    SIV-TANTALUS, TAN1
U72748                    SIV, SME543
Y00277                    SIV-D, MAC250
M32741                    SIV-D, MNE
M33262                    SIV-D, MM239
L09213                    SIV-D, SMM-PBJ-6P9
M80194                    SIV-D, SMM9
M83293                    SIV-D, STM
U51190                    HIV1-A, 92ug037
AF110967                  HIV1-C, 96bw05.02
M27323                    HIV1-D, NDK
AF005494                  HIV1-F, 93br020.1
AF005496                  HIV1-H, 90cr056
AJ006022                  HIV1-N, YBF30
L20587                    HIV1-O, ANT70C
L20571                    HIV1-O, MVP5180
D00835                    HIV2-A, CAM2
M30895                    HIV2-A, GH1
U27200                    HIV2-B, EHO
L07625                    HIV2-B, UC1

[0091] With the aid of the Genbank accession numbers of these sequences, the actual sequence entries were extracted from the gene database “Genbank”. Using the annotations, the genes env, gag and pol were extracted from these sequences and translated into amino acid sequences. For the translation, only those sequences were used which were annotated as functional. Pseudogenes and genome sections not annotated as one of the three genes were not taken into consideration.

[0092] In addition, the genome sequence of SIM27 was compared with the current gene database “Genbank” in order not to overlook an SIV partial sequence closely related to SIM27. Two partial sequences of SIVrcm (gag and pol) and a pol partial sequence from Mandrillus leucophaeus (Clewley J P et al., J. Virol. 1998; 72: 10305-10309) were identified as additionally relevant here:

RCM-GAG     SIV, RCM gag
RCM-POL     SIV, RCM pol
CLEW-POL    SIV, Drill, Clewley

[0093] In total, 4 data sets were obtained in this way: 3 protein data sets (env, gag and pol), and one from genomic sequences (GENOME).

[0094] Alignment:

[0095] The above sequences were aligned together with the corresponding SIM27 sequences using CLUSTALW (Version 1.74) with standard settings (Thompson J D et al., Nucleic Acids Res. 22: 4673-4680 (1994)). The sequence alignments thus obtained were then checked manually.

[0096] The published pol partial sequence of drill monkeys (Clewley et al.) and the pol partial sequence of the RCM monkey were each added to the pol sequence alignment in separate analyses. The same was done with the GAG partial sequence of the RCM monkey for the gag alignment.

[0097] For the addition of the individual sequences to the alignments, the profile alignment option of CLUSTALW 1.74 was used with standard settings.

[0098] 3 further protein data sets with small partial sequences RCM-GAG, RCM-POL and DRILL-POL thus resulted. Each of these data sets was considered only with respect to the region of the respective partial sequence.

[0099] Phylogenetic analyses:

[0100] Using the above seven alignments (GENOME (FIG. 1), GAG (FIG. 2), RCM-GAG (FIG. 3), POL (FIG. 4), RCM-POL (FIG. 5), DRILL-POL (FIG. 6), ENV (FIG. 7)), phylogenetic trees were then set up independently. For this, the neighbor-joining method as implemented in CLUSTALW 1.74 was used with 1000 bootstrap replicates. The trees were calculated with the standard settings, except that all alignment positions containing gaps were ignored and the correction for multiple mutations was switched on.
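The three non-standard steps named in this paragraph (ignoring gapped alignment positions, correcting for multiple mutations, and bootstrap resampling) can be sketched as follows. This is a minimal illustration and not the CLUSTALW code: the function names are hypothetical, the correction shown is Kimura's empirical formula for protein distances, −ln(1 − p − p²/5), which is assumed here to approximate the correction applied to the protein data sets, and the neighbor-joining step itself is omitted.

```python
import math
import random


def drop_gap_columns(alignment: dict) -> dict:
    """Remove every alignment column in which at least one sequence has a gap ('-')."""
    kept = [col for col in zip(*alignment.values()) if "-" not in col]
    return {
        name: "".join(col[i] for col in kept)
        for i, name in enumerate(alignment)
    }


def kimura_protein_distance(seq_a: str, seq_b: str) -> float:
    """Correct the observed fraction of differing sites p as -ln(1 - p - p^2/5)."""
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    p = differences / len(seq_a)
    # Note: the formula is undefined for very divergent pairs (p above about 0.85).
    return -math.log(1.0 - p - 0.2 * p * p)


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Build one bootstrap replicate by resampling columns with replacement."""
    names = list(alignment)
    columns = list(zip(*(alignment[name] for name in names)))
    sample = [rng.choice(columns) for _ in columns]
    return {name: "".join(col[i] for col in sample) for i, name in enumerate(names)}


if __name__ == "__main__":
    # Toy alignment, not taken from the data sets above.
    toy = {"a": "MK-LV", "b": "MKTLV", "c": "MR-LI"}
    stripped = drop_gap_columns(toy)
    print(stripped)
    print(kimura_protein_distance(stripped["a"], stripped["c"]))
    print(bootstrap_replicate(stripped, random.Random(0)))
```

In a full analysis, a distance matrix would be computed for each of the 1000 replicates, a neighbor-joining tree built from each matrix, and the frequency of each grouping across the replicates reported as its bootstrap support.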

EXAMPLE 5

[0101] Detection of the diagnostic relevance in the Western blot

[0102] According to known methods of molecular biology (Current Protocols in Molecular Biology, Wiley Interscience, 1994), the region of env containing the cysteine loop was stably expressed either as a fusion with the maltose-binding protein (pMAL, New England Biolabs) or as a fusion with β-Gal (Knapp et al., Biotechniques, Vol. 8, No. 3, 1990). The proteins were blotted onto nitrocellulose, incubated overnight with the sera at a dilution of 1:100 in TBS (150 mM NaCl, 50 mM Tris pH 8.0) containing 5% skimmed milk, washed with TBS, incubated for 2 h with anti-human IgG-AP (Sigma A064) and anti-monkey IgG-AP (Sigma A1929) at a dilution of 1:1000, and stained according to the manufacturer's instructions by means of Nitrotetrazolium Blue (Sigma N-6878) and 5-bromo-4-chloroindolyl phosphate p-toluidine salt (Bachem M105). The results shown in Table 9 were obtained (FIG. 9).

TABLE 9
Protein/serum              Anti-HIV1 serum    Anti-HIV1-subtype O serum    Anti-HIV2 serum    Anti-SIV-drill 7 serum
PMAL-HIV1env               +++                ++                           −                  −
PSEM-HIV1-subtype-O-env    ++                 ++                           −                  −
pMAL-HIV2-env              −                  −                            +++                −
pMAL-SIM27-env             −                  −                            −                  +++
pMAL                       −                  −                            −                  −
PSEM                       −                  −                            −                  −

[0103] Surprisingly, the env region of SIM27 did not react with anti-HIV-1, anti-HIV-1 subtype O or anti-HIV-2 sera. Conversely, antibodies against SIM27, which react strongly with SIM27-env, could not be detected using HIV-1-env, HIV-1 subtype O env or HIV-2-env. It is therefore to be assumed that, should SIM27 or a variant with comparable serological properties make the transition into the human population, antibodies against SIM27 could not be detected in human sera with the tests currently employed; instead, SIM27-env, or antigens derived therefrom having comparable immunological properties, would have to be employed.

FIGURES

[0104] FIG. 1

[0105] Phylogenetic investigation of the sequences of Table 11 including the total genome of SIM27 as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0106] FIG. 2

[0107] Phylogenetic investigation of the GAG proteins extracted from the sequences of Table 11 including the GAG protein of SIM27 (Table 8) as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0108] FIG. 3

[0109] Phylogenetic investigation of the GAG proteins extracted from the sequences of Table 11 including the GAG protein of SIM27 (Table 8) and the GAG partial sequence of SIVrcm as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0110] FIG. 4

[0111] Phylogenetic investigation of the POL proteins extracted from the sequences of Table 11 including the POL protein of SIM27 (Table 9) as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0112] FIG. 5

[0113] Phylogenetic investigation of the POL proteins extracted from the sequences of Table 11 including the POL protein of SIM27 (Table 9) and the POL partial sequence of SIVrcm as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0114] FIG. 6

[0115] Phylogenetic investigation of the POL proteins extracted from the sequences of Table 11 including the POL protein of SIM27 (Table 9) and the POL partial sequence as published by Clewley (Clewley J P et al., J. Virol. 1998; 72: 10305-10309) and as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0116] FIG. 7

[0117] Phylogenetic investigation of the ENV proteins extracted from the sequences of Table 11 including the ENV protein of SIM27 (Table 10) as described in Example 4 by the multiple alignment and the neighbor-joining method of ClustalW Version 1.74.

[0118] FIG. 8

[0119] General survey of the individual PCR amplifications which led to the complete genomic nucleic acid sequence of SIM27.

[0120] FIG. 9

[0121] Western blot, as described in Example 5.

ABBREVIATIONS

[0122] HIV: Human immunodeficiency virus

[0123] SIV: Simian (monkey) immunodeficiency virus

[0124] HTLV: Human T-cell leukemia/lymphoma virus

[0125] STLV: Simian T-cell leukemia/lymphoma virus

[0126] p: Protein

[0127] gp: Glycoprotein

[0128] pol: Gene of the enzymes of HIV or SIV, named after the polymerase

[0129] gag: Gene of the core proteins of HIV or SIV

[0130] env: Gene of the surface glycoproteins of HIV or SIV

[0131] IN: Integrase

[0132] RT: Reverse transcriptase

[0133] PR: Protease

1 57 1 20 DNA Unknown misc_feature ()..() primer, non-genomic DNA 1 gccatgtgtc caaaatgtca 20 2 20 DNA Unknown misc_feature ()..() primer, non-genomic DNA 2 cttctctgta gtagactcta 20 3 18 DNA Unknown misc_feature ()..() primer, non-genomic DNA 3 tagtagcagt ccmyrkwg 18 4 20 DNA Unknown misc_feature ()..() primer, non-genomic DNA 4 tctctaattt gtcctatgat 20 5 19 DNA Unknown misc_feature ()..() primer, non-genomic DNA 5 atgcccattg gatgaggac 19 6 21 DNA Unknown misc_feature ()..() primer, non-genomic DNA 6 gactgtggct accttttcac t 21 7 18 DNA Unknown misc_feature ()..() primer, non-genomic DNA 7 catcggtgaa taatcagg 18 8 19 DNA Unknown misc_feature ()..() primer, non-genomic DNA 8 ggtattactt ctgcctcta 19 9 20 DNA Unknown misc_feature ()..() primer, non-genomic DNA 9 ctcaataaag cttgccttga 20 10 19 DNA Unknown misc_feature ()..() primer, non-genomic DNA 10 gtcctcatcc aatgggcat 19 11 19 DNA Unknown misc_feature ()..() primer, non-genomic DNA 11 trdctagaga tccctcaga 19 12 21 DNA Unknown misc_feature ()..() primer, non-genomic DNA 12 ccaatactgt gatctgttca c 21 13 18 DNA Unknown misc_feature ()..() primer, non-genomic DNA 13 cctattcatg gccaggta 18 14 21 DNA Unknown misc_feature ()..() primer, non-genomic DNA 14 gatttttctc tactctcact a 21 15 21 DNA Unknown misc_feature ()..() primer, non-genomic DNA 15 agtgaaaagg tagccacagt c 21 16 21 DNA Unknown misc_feature ()..() primer, non-genomic DNA 16 gatttttctc tactctcact a 21 17 279 DNA SIV - viral 17 agtagcagtc catgtagcca gtggatacct agaggcagaa gtaataccag cagagacagg 60 aaaagagaca gcacatttcc tgttaaagtt agcaggcagg tggcctgtaa aacatttaca 120 cactgacaat ggccccaact ttgtcagtga aaaggtagcc acagtctgtt ggtgggctca 180 aatagagcac accacaggtg taccctataa cccccagagt cagggagtag tggaagcaaa 240 gaatcatcat cttaagacaa tcataggaca aattagaga 279 18 195 DNA SIV - viral Unsure (175)..(175) “n” can be any base 18 gggattccgc anccggcagg tctaaaacaa tgtgaacaga tcacagtatt ggatatagga 60 gatgcctatt tttcatgccc attggatgag gactttagaa agtatactgc attcaccatt 120 ccatcggtga 
ataatcaggg gcccaggaat cagataccag tataatgtcc tcccncaggg 180 ntggaagggg tcccc 195 19 1551 DNA SIV - viral 19 catcggtgaa taatcagggc ccaggaatca gataccagta taatgtcctc ccacagggat 60 ggaaaggctc tccagcaatt tttcaggcaa cagctgataa aatcttgaaa acattcaaag 120 aagaatacca gaggtattaa tttatcagta tatggatgat ctgttcgtgg gaagtgactt 180 aaatgccact gaacataaca aaatgataaa caagttgaga gagcatctga gattctgggg 240 gctcgagacc ccagataaga agtttcaaaa ggaacctcct tttgaatgga tgggatatgt 300 gctacaccca aagaaatgga cagtgcagaa aatacaacta ccagaaaaag agcaatggac 360 agtgaatgat attcagaaat tggtaggaaa acttaattgg gcaagtcaga tatattccgg 420 aattaaaaca aaagagctct gtaaattgat cagaggagca aaacctctag atgaaatagt 480 agaatggaca agagaagcag aattagagta tgaagagaat aagataatag tgcaggagga 540 ggtgcatgga gtgtactatc agccagaaaa accactgatg gcaaaagtac aaaagttgac 600 acaaggacag tggagttatc aaatagagca agaagaaaac aaacctctca aggcaggaaa 660 atatgccagg acaaagaatg cccacacaaa tgagttaagg acacttgcag ggttagtaca 720 aaaaatagcc aaggaatgca tagtaatctg gggaagattg ccaaaatttt acctcccctt 780 ggagagagaa gtatgggatc aatggtggca tgattattgg caggtaacat ggatcccaga 840 gtgggaattc atctcaacac caccattgat aaggctatgg tacaacctcc tgaaagaacc 900 aattccagga gaagatgtat actatgtaga tggggcagct aacagaaatt ctaaagaagg 960 caaggcagga tactatacag caaggggcaa aagtaaggta atagctttag aaaatacaac 1020 caatcagaag gcagagctga aggcaataga attagcccta aaagattcag gaccaagagt 1080 aaacatagta acagattcac agtatgcatt aggcatactc acagcatccc cagatcagtc 1140 agataacccc atagttaggg aaataattaa cctcatgata gccaaggaag cagtctacct 1200 gtcatgggta ccagcccaca agggtatagg aggtaacgaa caaatagaca aattagtaag 1260 ccaaggaatt aggcaagtac tattcctgga aggaatagac agagctcagg aagaacacga 1320 caaatatcat aacaactgga gagctttagc tcaggaattc agcatacctc ctatagtggc 1380 aaaagagata gttgcacaat gcccaaaatg ccagataaaa ggggaaccta ttcatggcca 1440 ggtagatgca agtcctggga catggcaaat ggattgcacc catctagaag gaaaggtcat 1500 catagtggca gtccatgtag ccagtggata cctagaggca gaagtaatac c 1551 20 2500 DNA SIV - viral 20 trdctagaga tccctcagat ttgtgccaga cttctgatat 
ctagtgagag tagagaaaaa 60 tctccagcag tggcgcccga acagggactt gacgaagagc caagtcattc ccacctgtga 120 gggacagcgg cggcagccrg ccggaccgac ccacccggtg aagtgagtta accaaggagc 180 cccgacgcgc aggacacaag gtaagcggtg caccgtgctg tagtgagtgt gtgtccagga 240 tccgcttgag caggcgagat cgccgaggca accccagtag aaaaagaaaa gaggggaagt 300 aaggccgagg caaagtgaaa gtaaaagaga tcctctgaga agaggaacag ggggcaataa 360 aattggcgcg agcgcgtcag gacttagggg aagagaattg gatgagctgg aaaagattag 420 gttacggccc tccggaaaga aaaaatacca gctaaaacat gtgatatggg taagcaagga 480 actagataga tttggcctac atgaaaagtt gttagaaacc aaggaaggat gcgaaaaaat 540 tcttagcgta ctctttcctc tagttcctac agggtcagaa aatttaattt cgctgtacaa 600 cacctgctgt tgcatttggt gcgtacatgc gaaagtgaaa gtagcagata cagaagaggc 660 aaaagagaaa gtaaracaat gctaccatct agtggttgaa aaacagaatg cagcctcaga 720 aaaagaaaaa ggagcaacag tgacacctag tggccactca araaattacc ccattcagat 780 agtaaatcaa accccagtac accagggaat ttctcccaga acactgaatg cttgggtaaa 840 atgtatagag gagaagaaat tcagcccaga aatagtgcct atgttcatag ctttgtcaga 900 aggatgcctc ccatacgacc tcaacggcat gctcaatgcc attggggacc atcagggagc 960 tctccaaata gtgaaagatg tcatcaatga cgaagctgca gactgggatc ttagacatcc 1020 tcagatgggg cctatgcccc aaggggtgct aagaaaccca acagggagtg acatagcagg 1080 aaccaccagc agcatagaag aacaaattga atggacaact aggcagcaag atcaggtaaa 1140 tgtaggagga atttacaaac aatggatagt tctgggattg caaaaatgtg tgagcatgta 1200 caatccagtg aatattctag atataaaaca gggaccaaaa gaacccttta aggactatgt 1260 ggatcgattt tacaaagctc tgcgggcgga gcgaacagat ccacaagtga aaaactggat 1320 gacgcagaca ttgctcatcc agaatgcaaa cccagattgt aaagccattc ttaagggatt 1380 aggcatgaac cccaccttgg aagaaatgtt attggcatgt caaggagtag ggggaccaaa 1440 gtataaagct caaatgatgg cagaagcaat gcaggaggtg caaggaaaaa ttatgatgca 1500 agcctcggga ggaccaccgc ggggtccccc aaggcagcca cccagaaatc ctagatgccc 1560 caactgtgga aagtttggac atgtactgag agactgtaga gccccaagaa agcgaggatg 1620 cttcaagtgt ggagatccag gacatctgat gagaaactgc ccaaagatgg tgaatttttt 1680 agggaatgct ccytggggca gtggcaaacc caggaacttt cctgccgtgc cactgacccc 1740 
aacggcaccc ccgatgccag gattagagga yccagcagag argatgctrc tggattacat 1800 gaagaagggg caacagatga aggcagagag ggaagccaaa cgggagaagg acaaaggccc 1860 ttacgaggcg gcttacaact ccctcagttc tctctttgga acagaccaac tacagtagta 1920 gagatagagg ggcaaaaagt ggaggcccta ctagatacag gagcagatga cacagtaatc 1980 aaagatttac aattaacagg caattggaaa ccacaaatca taggaggaat tggaggagca 2040 attagggtaa agcaatattt caattgtaaa ataacagtgg caggtaaaag cactcatgct 2100 tcagtactag tgggccccac tcctgtaaat attataggta gaaatgtact taaaaagtta 2160 ggatgtactt tgaactttcc tattagtaar atagaaacag taaaggtaac actaaaacca 2220 ggaactgatg gaccaagaat caaacagtgg ccactgtcta aagaaaagat tttagcctta 2280 caagaaatat gcaatcagat ggaaaaagaa ggcaaaatct ctagaatagg tccagaaaat 2340 ccttacaaca caccagtgtt ttgtataaaa aagaaagatg gagccagctg gagaaaactg 2400 gtagatttta gacaattgaa taaagtgaca caggatttct ttgaggtgca gctaggaatc 2460 ccacatcctg gaggtctaaa acaatgtgaa cagatcacag 2500 21 5428 DNA SIV - viral 21 agtgaaaagg tagccacagt ctgttggtgg gctcaaatag agcacaccac aggtgtaccc 60 tataaccccc agagtcaggg agtagtggaa gcaaagaatc atcatcttaa gacaatcata 120 gaacaagtta gggatcaagc agaaaaatta gaaacagcag tacaaatggc agtattaata 180 cacaatttta aaagaaaagg ggggataggg gagtatagtc caggagaaag aatagtagat 240 atcataacca cagacattct aacaactaaa ttacaacaaa atatttcaaa aattcaaaat 300 tttcgggttt attacagaga aggaagggat caacagtgga aaggaccagc agaactcatt 360 tggaaaggag aaggcgctgt ggtgattaaa gaagggacag acttaaaggt ggtaccaaga 420 agaaaagcca aaatcatcag agattatgga aaagcagtgg atagtaattc ccacatggag 480 agtagagagg aatcagcttg agaaatggaa ttcattagta aaatatcata aatatagggg 540 agaaaaatac ctagaaagat gggaactata ccaccatttc caatgctcgg ggtggtggac 600 acactctaga aaagatgttt actttaaaga tggctcagta ataagcatta ctgccttctg 660 gaatcttacc ccagagaaag gatggttgtc tcaatatgca gttacaatag aatatgtaaa 720 agaaagctat tatacttaca tagacccagt tacagcagac agaatgattc attgggaata 780 tttcccatgt tttacagccc aggctgtgag aaaagtactg tttggagaaa gactaatagc 840 ttgctacagc ccctggggac acaaaggaca ggtagggact ctacaattcc tggctttgca 900 agcttacctt cagtattgta 
aacatggcag aaagagcacc agaagtgccg gaaggggcag 960 gagagatacc tctagaacag tggctagaaa gatcattaga acaactcaac agagaggccc 1020 ggttacactt ccacccagag ttccttttcc gtctttggaa cacttgtgta gaacattggc 1080 atgatagaca ccagaggagc ctggagtatg caaaatacag atatcttttg ttggtgcata 1140 aggccatgtt tacccatatg caacagggat gcccatgtag aaatgggcac ccaagaggac 1200 ctcctcctcc aggattggcc taatttctgt cttgcagatg gaacagccac ctgaggacga 1260 ggctccacag agagaacctt ataatgaatg gctgatagat accttggcag aaatccagga 1320 agaagctttg aagcattttg ataggcgctt gctacatgca gtaggctcat gggtgtatga 1380 gcaacaggga gacaccttag aaggtgtcca aaagctaata actattctac aaagagcttt 1440 gtttttgcac ttcaggcatg gatgcaggga aagccgcatt ggacaagcag gagggaaata 1500 taattccctc agatcctttc caaggccaga caaccccttg taataaatgc tattgtaaaa 1560 gatgttgcta tcactgccag ttatgcttct tgcagaaagc cttagggata cattatcatg 1620 tctacagagt caggagacct cgacagagat ttttgggcga agtaccacca catagtgcag 1680 caactgtgga aaggtaagta aaaagtaagt agacatgctt agatatatag ttttaggaat 1740 agtcatagga ttagggatag gacaccaatg ggttacagtg tattatggaa cacctaaatg 1800 gcacccagct aggacacatc tcttttgtgc aacagataat aattcctttt gggtcacaac 1860 aagttgtgtg cccagcctat tgcactatga agaacaacac attcccaaca taacagaaaa 1920 cttcacaggc cccataacag agaatgaagt aataagacaa gcatggggag ctatctcttc 1980 catgatagat gcagtcttaa aaccctgtgt aaagctgaca ccatattgtg tcaagatgaa 2040 atgcacaaag ggagatactg atactacaga aaggacaaca tcaaccactt cctcttggtc 2100 cacatccacc ccaacctcta cccctatgac tcccaatacc actggattag atatagactc 2160 aaacaataca gaacccacaa cacaagagaa tcggatatgt aaatttaata ctacaggatt 2220 atgtagagac tgcagattgg aaatagaaga aaacttcaga tatcaggata taacatgtag 2280 aaatagtagt gaagatactg aagagtgcta tatgacacat tgtaactcat cagtaataac 2340 acaggattgc aataaggcat caacagataa aatgactttt aggttgtgtg caccaccagg 2400 atatgtcctg ttgagatgta gagaaaagct aaaccaaacc aaattgtgtg gcaatattac 2460 agcagtgcaa tgcactgacc caatgcctgc aactatatcc actatgtttg gatttaatgg 2520 gaccaaacat gactatgatg agctaatttt aacaaaccct caaaagataa atgagtttca 2580 tgatcacaag tatgtatata gagttgataa 
aaaatggaag ctacaggtag tatgtagaag 2640 aaaagggaat agatcaataa tatcaacgcc aagtgctacg ggcttattgt tctatcatgg 2700 gctagaacca gggaaaaatt taaaaaaggg gatgtgccag ctgaagggat tatggggaaa 2760 ggccatgcac caactatcag aggaacttag aaagataaat ggaagtattt atagaaaatg 2820 gaatgagaca gcaggctgca gaaagctaaa caaacagaac ggtacaggtt gctcattgaa 2880 aacaatagaa gttagtgagt acaccacgga gggcgatccg ggggcagaga caattatgct 2940 tctttgtgga ggtgagtatt tcttttgtaa ttggacaaag atttggaaga catggaataa 3000 ccagacgtca aatgtctggt atccttggat gtcatgcaat attagacaaa ttgtagatga 3060 ttggcataaa gtagggaaaa aaatttatat gcctcctgca agtggattta acaatgagat 3120 aaggtgtact aatgatgtca cggaaatgtt ctttgaggtt cagaagaagg aagagaataa 3180 atatttaata aagtttatac ctcaagatga gatacaaaat cagtatacag cagtaggagc 3240 acattataaa ttggtgaaag tggatcctat agggttcgca cccacagatg tgcatagata 3300 ccatctacca gatgtaaagc agaagagagg agcagtcttg cttggaatgc tcggcctctt 3360 aggtttggca ggttccgcga tgggctcagt ggcgatagca ctgacggtcc agtcccaggc 3420 tttattgaat gggattgtgg agcagcagaa ggttctgctg agcctgatag atcagcactc 3480 cgagttatta aaactaacta tctggggtgt aaaaaatctt caggcccgcc tcacagcctt 3540 ggaggaatac gtagcggacc aatcaagact ggcagtatgg ggatgctcat tctctcaagt 3600 atgccacact aatgtaaagt ggcctaatga ttcaatagtt cctaactgga cctcggaaac 3660 atggcttgaa tgggataaaa gagtgacagc aattacaaca aatatgacaa tagacttgca 3720 gagggcatat gaattggaac aaaagaatat gtttgagctt caaaaattag gagatctcac 3780 ctcctgggcc agctggttcg acctcacgtg gtggtttaaa tatattaaga taggaattct 3840 tataataata gtgataatag gacttagaat attagcttgc ttatggtcag tattaggcag 3900 gtttaggcag ggttaccgcc ctcttcctta tgtcttcaag ggagactatc accgacccca 3960 caacctcaaa cagccagaca aagaaagagg agaagagcaa gacagagaaa aacagaacat 4020 cagctcagag aattacaggc caggatctgg cagagcttgg agcaaagagc aagtagagac 4080 ctggtggaag gagtccaggc tctacatttg gttgaagagc acacaagcag taattgaata 4140 tgggtggcaa gagctcaaag cagcaggagc agaaatatat aaaatattac agagcgctgc 4200 gcagaggcta tggagcggag ggcaccaact cggactatca tgtattagag cagctacagc 4260 ctttggcaga ggagtcagaa acattcctag acgcatcaga 
caaggagcag aagtcttact 4320 caactgagtt agacttaaga catcaacaag atgtaagcct ccccacagaa gaagaacagc 4380 cttgggaaga ggaagaggag gtaggctttc cagtctaccc acgacagcct gtgcatgaag 4440 ccacctataa agacttgata gacctgtccc actttttaaa agaaaagggg ggactggaag 4500 ggatttggtg gtctaaaaga agagaagaaa tcttggatat atatgcacaa aatgaatggg 4560 gaattatacc tgactggcag gcttacactt caggaccggg gatcaggtat ccaaaagcat 4620 ttgggttcct gtttaaactg atcccagtgg cagttccacc ggaacaagag aacaatgaat 4680 gcaataggct gctaaactct tctcagacag gaatccagga agatccatgg ggagaaaggc 4740 tcatgtggaa gtttgactct gctcttgcct atactttcta tgctcccata aagaggccag 4800 gagacttcaa gcatgtccaa agtcttagct atgaagctta taagaaggaa cctgactgct 4860 gcaagaggaa gtggtggcgc ttctagccga ccacagaggg ttgctatggc gatacccttt 4920 aaaactgcta actctggagg gactttccac tagtgcatgc gcactggact ggggactttc 4980 caggatgacg ccgggtgggg gagtggtcag cccaatctgg ctgcatataa gcagctcgct 5040 ttgcgcttgt attgagtctc tccctgagag gctaccagat tgagcctagg ttgttctctg 5100 gtgagtcctt gaaggagtgc ctgcttgtag ccctgggcgg ttcgcaggcc cctggcttgt 5160 agctctgggt agctcgtcag gtgttctgga aaggtcttgc taaggggacg cctttgcttg 5220 gtcttggtag acctctagca gtctcagtgg ccaggaggct gtgggattga ctaccgcttg 5280 cttgcctttg atgctcaata aagcttaccc gaattagaaa ggcattcaag tgtactcgct 5340 cattttgtct ttggtagaaa ctctggttac tggagatccc tcagatttgt gccagacttc 5400 tgatatctag tgagagtaga gaaaaatc 5428 22 9641 DNA SIV - viral 22 trdctagaga tccctcagat ttgtgccaga cttctgatat ctagtgagag tagagaaaaa 60 tctccagcag tggcgcccga acagggactt gacgaagagc caagtcattc ccacctgtga 120 gggacagcgg cggcagccgg ccggaccgac ccacccggtg aagtgagtta accaaggagc 180 cccgacgcgc aggacacaag gtaagcggtg caccgtgctg tagtgagtgt gtgtccagga 240 tccgcttgag caggcgagat cgccgaggca accccagtag aaaaagaaaa gaggggaagt 300 aaggccgagg caaagtgaaa gtaaaagaga tcctctgaga agaggaacag ggggcaataa 360 aattggcgcg agcgcgtcag gacttagggg aagagaattg gatgagctgg aaaagattag 420 gttacggccc tccggaaaga aaaaatacca gctaaaacat gtgatatggg taagcaagga 480 actagataga tttggcctac atgaaaagtt gttagaaacc aaggaaggat gcgaaaaaat 540 
tcttagcgta ctctttcctc tagttcctac agggtcagaa aatttaattt cgctgtacaa 600 cacctgctgt tgcatttggt gcgtacatgc gaaagtgaaa gtagcagata cagaagaggc 660 aaaagagaaa gtaaaacaat gctaccatct agtggttgaa aaacagaatg cagcctcaga 720 aaaagaaaaa ggagcaacag tgacacctag tggccactca agaaattacc ccattcagat 780 agtaaatcaa accccagtac accagggaat ttctcccaga acactgaatg cttgggtaaa 840 atgtatagag gagaagaaat tcagcccaga aatagtgcct atgttcatag ctttgtcaga 900 aggatgcctc ccatacgacc tcaacggcat gctcaatgcc attggggacc atcagggagc 960 tctccaaata gtgaaagatg tcatcaatga cgaagctgca gactgggatc ttagacatcc 1020 tcagatgggg cctatgcccc aaggggtgct aagaaaccca acagggagtg acatagcagg 1080 aaccaccagc agcatagaag aacaaattga atggacaact aggcagcaag atcaggtaaa 1140 tgtaggagga atttacaaac aatggatagt tctgggattg caaaaatgtg tgagcatgta 1200 caatccagtg aatattctag atataaaaca gggaccaaaa gaacccttta aggactatgt 1260 ggatcgattt tacaaagctc tgcgggcgga gcgaacagat ccacaagtga aaaactggat 1320 gacgcagaca ttgctcatcc agaatgcaaa cccagattgt aaagccattc ttaagggatt 1380 aggcatgaac cccaccttgg aagaaatgtt attggcatgt caaggagtag ggggaccaaa 1440 gtataaagct caaatgatgg cagaagcaat gcaggaggtg caaggaaaaa ttatgatgca 1500 agcctcggga ggaccaccgc ggggtccccc aaggcagcca cccagaaatc ctagatgccc 1560 caactgtgga aagtttggac atgtactgag agactgtaga gccccaagaa agcgaggatg 1620 cttcaagtgt ggagatccag gacatctgat gagaaactgc ccaaagatgg tgaatttttt 1680 agggaatgct ccctggggca gtggcaaacc caggaacttt cctgccgtgc cactgacccc 1740 aacggcaccc ccgatgccag gattagagga cccagcagag aagatgctac tggattacat 1800 gaagaagggg caacagatga aggcagagag ggaagccaaa cgggagaagg acaaaggccc 1860 ttacgaggcg gcttacaact ccctcagttc tctctttgga acagaccaac tacagtagta 1920 gagatagagg ggcaaaaagt ggaggcccta ctagatacag gagcagatga cacagtaatc 1980 aaagatttac aattaacagg caattggaaa ccacaaatca taggaggaat tggaggagca 2040 attagggtaa agcaatattt caattgtaaa ataacagtgg caggtaaaag cactcatgct 2100 tcagtactag tgggccccac tcctgtaaat attataggta gaaatgtact taaaaagtta 2160 ggatgtactt tgaactttcc tattagtaag atagaaacag taaaggtaac actaaaacca 2220 ggaactgatg 
gaccaagaat caaacagtgg ccactgtcta aagaaaagat tttagcctta 2280 caagaaatat gcaatcagat ggaaaaagaa ggcaaaatct ctagaatagg tccagaaaat 2340 ccttacaaca caccagtgtt ttgtataaaa aagaaagatg gagccagctg gagaaaactg 2400 gtagatttta gacaattgaa taaagtgaca caggatttct ttgaggtgca gctaggaatc 2460 ccacatcctg gaggtctaaa acaatgtgaa cagatcacag tattggatat aggagatgcc 2520 tatttttcat gcccattgga tgaggacttt agaaagtata ctgcattcac cattccatcg 2580 gtgaataatc agggcccagg aatcagatac cagtataatg tcctcccaca gggatggaaa 2640 ggctctccag caatttttca ggcaacagct gataaaatct tgaaaacatt caaagaagaa 2700 tacccagagg tattaattta tcagtatatg gatgatctgt tcgtgggaag tgacttaaat 2760 gccactgaac ataacaaaat gataaacaag ttgagagagc atctgagatt ctgggggctc 2820 gagaccccag ataagaagtt tcaaaaggaa cctccttttg aatggatggg atatgtgcta 2880 cacccaaaga aatggacagt gcagaaaata caactaccag aaaaagagca atggacagtg 2940 aatgatattc agaaattggt aggaaaactt aattgggcaa gtcagatata ttccggaatt 3000 aaaacaaaag agctctgtaa attgatcaga ggagcaaaac ctctagatga aatagtagaa 3060 tggacaagag aagcagaatt agagtatgaa gagaataaga taatagtgca ggaggaggtg 3120 catggagtgt actatcagcc agaaaaacca ctgatggcaa aagtacaaaa gttgacacaa 3180 ggacagtgga gttatcaaat agagcaagaa gaaaacaaac ctctcaaggc aggaaaatat 3240 gccaggacaa agaatgccca cacaaatgag ttaaggacac ttgcagggtt agtacaaaaa 3300 atagccaagg aatgcatagt aatctgggga agattgccaa aattttacct ccccttggag 3360 agagaagtat gggatcaatg gtggcatgat tattggcagg taacatggat cccagagtgg 3420 gaattcatct caacaccacc attgataagg ctatggtaca acctcctgaa agaaccaatt 3480 ccaggagaag atgtatacta tgtagatggg gcagctaaca gaaattctaa agaaggcaag 3540 gcaggatact atacagcaag gggcaaaagt aaggtaatag ctttagaaaa tacaaccaat 3600 cagaaggcag agctgaaggc aatagaatta gccctaaaag attcaggacc aagagtaaac 3660 atagtaacag attcacagta tgcattaggc atactcacag catccccaga tcagtcagat 3720 aaccccatag ttagggaaat aattaacctc atgatagcca aggaagcagt ctacctgtca 3780 tgggtaccag cccacaaggg tataggaggt aacgaacaaa tagacaaatt agtaagccaa 3840 ggaattaggc aagtactatt cctggaagga atagacagag ctcaggaaga acacgacaaa 3900 tatcataaca actggagagc 
tttagctcag gaattcagca tacctcctat agtggcaaaa 3960 gagatagttg cacaatgccc aaaatgccag ataaaagggg aacctattca tggccaggta 4020 gatgcaagtc ctgggacatg gcaaatggat tgcacccatc tagaaggaaa ggtcatcata 4080 gtggcagtcc atgtagccag tggataccta gaggcagaag taataccagc agagacagga 4140 aaagagacag cacatttcct gttaaagtta gcaggcaggt ggcctgtaaa acatttacac 4200 actgacaatg gccccaactt tgtcagtgaa aaggtagcca cagtctgttg gtgggctcaa 4260 atagagcaca ccacaggtgt accctataac ccccagagtc agggagtagt ggaagcaaag 4320 aatcatcatc ttaagacaat catagaacaa gttagggatc aagcagaaaa attagaaaca 4380 gcagtacaaa tggcagtatt aatacacaat tttaaaagaa aaggggggat aggggagtat 4440 agtccaggag aaagaatagt agatatcata accacagaca ttctaacaac taaattacaa 4500 caaaatattt caaaaattca aaattttcgg gtttattaca gagaaggaag ggatcaacag 4560 tggaaaggac cagcagaact catttggaaa ggagaaggcg ctgtggtgat taaagaaggg 4620 acagacttaa aggtggtacc aagaagaaaa gccaaaatca tcagagatta tggaaaagca 4680 gtggatagta attcccacat ggagagtaga gaggaatcag cttgagaaat ggaattcatt 4740 agtaaaatat cataaatata ggggagaaaa atacctagaa agatgggaac tataccacca 4800 tttccaatgc tcggggtggt ggacacactc tagaaaagat gtttacttta aagatggctc 4860 agtaataagc attactgcct tctggaatct taccccagag aaaggatggt tgtctcaata 4920 tgcagttaca atagaatatg taaaagaaag ctattatact tacatagacc cagttacagc 4980 agacagaatg attcattggg aatatttccc atgttttaca gcccaggctg tgagaaaagt 5040 actgtttgga gaaagactaa tagcttgcta cagcccctgg ggacacaaag gacaggtagg 5100 gactctacaa ttcctggctt tgcaagctta ccttcagtat tgtaaacatg gcagaaagag 5160 caccagaagt gccggaaggg gcaggagaga tacctctaga acagtggcta gaaagatcat 5220 tagaacaact caacagagag gcccggttac acttccaccc agagttcctt ttccgtcttt 5280 ggaacacttg tgtagaacat tggcatgata gacaccagag gagcctggag tatgcaaaat 5340 acagatatct tttgttggtg cataaggcca tgtttaccca tatgcaacag ggatgcccat 5400 gtagaaatgg gcacccaaga ggacctcctc ctccaggatt ggcctaattt ctgtcttgca 5460 gatggaacag ccacctgagg acgaggctcc acagagagaa ccttataatg aatggctgat 5520 agataccttg gcagaaatcc aggaagaagc tttgaagcat tttgataggc gcttgctaca 5580 tgcagtaggc tcatgggtgt atgagcaaca 
gggagacacc ttagaaggtg tccaaaagct 5640 aataactatt ctacaaagag ctttgttttt gcacttcagg catggatgca gggaaagccg 5700 cattggacaa gcaggaggga aatataattc cctcagatcc tttccaaggc cagacaaccc 5760 cttgtaataa atgctattgt aaaagatgtt gctatcactg ccagttatgc ttcttgcaga 5820 aagccttagg gatacattat catgtctaca gagtcaggag acctcgacag agatttttgg 5880 gcgaagtacc accacatagt gcagcaactg tggaaaggta agtaaaaagt aagtagacat 5940 gcttagatat atagttttag gaatagtcat aggattaggg ataggacacc aatgggttac 6000 agtgtattat ggaacaccta aatggcaccc agctaggaca catctctttt gtgcaacaga 6060 taataattcc ttttgggtca caacaagttg tgtgcccagc ctattgcact atgaagaaca 6120 acacattccc aacataacag aaaacttcac aggccccata acagagaatg aagtaataag 6180 acaagcatgg ggagctatct cttccatgat agatgcagtc ttaaaaccct gtgtaaagct 6240 gacaccatat tgtgtcaaga tgaaatgcac aaagggagat actgatacta cagaaaggac 6300 aacatccacc acttcctctt ggtccacatc caccccaacc tctaccccta tgactcccaa 6360 taccactgga ttagatatag actcaaacaa tacagaaccc acaacacaag agaatcggat 6420 atgtaaattt aatactacag gattatgtag agactgcaga ttggaaatag aagaaaactt 6480 cagatatcag gatataacat gtagaaatag tagtgaagat actgaagagt gctatatgac 6540 acattgtaac tcatcagtaa taacacagga ttgcaataag gcatcaacag ataaaatgac 6600 ttttaggttg tgtgcaccac caggatatgt cctgttgaga tgtagagaaa agctaaacca 6660 aaccaaattg tgtggcaata ttacagcagt gcaatgcact gacccaatgc ctgcaactat 6720 atccactatg tttggattta atgggaccaa acatgactat gatgagctaa ttttaacaaa 6780 ccctcaaaag ataaatgagt ttcatgatca caagtatgta tatagagttg ataaaaaatg 6840 gaagctacag gtagtatgta gaagaaaagg gaatagatca ataatatcaa cgccaagtgc 6900 tacgggctta ttgttctatc atgggctaga accagggaaa aatttaaaaa aggggatgtg 6960 ccagctgaag ggattatggg gaaaggccat gcaccaacta tcagaggaac ttagaaagat 7020 aaatggaagt atttatagaa aatggaatga gacagcaggc tgcagaaagc taaacaaaca 7080 gaacggtaca ggttgctcat tgaaaacaat agaagttagt gagtacacca cggagggcga 7140 tccgggggca gagacaatta tgcttctttg tggaggtgag tatttctttt gtaattggac 7200 aaagatttgg aagacatgga ataaccagac gtcaaatgtc tggtatcctt ggatgtcatg 7260 caatattaga caaattgtag atgattggca taaagtaggg 
aaaaaaattt atatgcctcc 7320 tgcaagtgga tttaacaatg agataaggtg tactaatgat gtcacggaaa tgttctttga 7380 ggttcagaag aaggaagaga ataaatattt aataaagttt atacctcaag atgagataca 7440 aaatcagtat acagcagtag gagcacatta taaattggtg aaagtggatc ctatagggtt 7500 cgcacccaca gatgtgcata gataccatct accagatgta aagcagaaga gaggagcagt 7560 cttgcttgga atgctcggcc tcttaggttt ggcaggttcc gcgatgggct cagtggcgat 7620 agcactgacg gtccagtccc aggctttatt gaatgggatt gtggagcagc agaaggttct 7680 gctgagcctg atagatcagc actccgagtt attaaaacta actatctggg gtgtaaaaaa 7740 tcttcaggcc cgcctcacag ccttggagga atacgtagcg gaccaatcaa gactggcagt 7800 atggggatgc tcattctctc aagtatgcca cactaatgta aagtggccta atgattcaat 7860 agttcctaac tggacctcgg aaacatggct tgaatgggat aaaagagtga cagcaattac 7920 aacaaatatg acaatagact tgcagagggc atatgaattg gaacaaaaga atatgtttga 7980 gcttcaaaaa ttaggagatc tcacctcctg ggccagctgg ttcgacctca cgtggtggtt 8040 taaatatatt aagataggaa ttcttataat aatagtgata ataggactta gaatattagc 8100 ttgcttatgg tcagtattag gcaggtttag gcagggttac cgccctcttc cttatgtctt 8160 caagggagac tatcaccgac cccacaacct caaacagcca gacaaagaaa gaggagaaga 8220 gcaagacaga gaaaaacaga acatcagctc agagaattac aggccaggat ctggcagagc 8280 ttggagcaaa gagcaagtag agacctggtg gaaggagtcc aggctctaca tttggttgaa 8340 gagcacacaa gcagtaattg aatatgggtg gcaagagctc aaagcagcag gagcagaaat 8400 atataaaata ttacagagcg ctgcgcagag gctatggagc ggagggcacc aactcggact 8460 atcatgtatt agagcagcta cagcctttgg cagaggagtc agaaacattc ctagacgcat 8520 cagacaagga gcagaagtct tactcaactg agttagactt aagacatcaa caagatgtaa 8580 gcctccccac agaagaagaa cagccttggg aagaggaaga ggaggtaggc tttccagtct 8640 acccacgaca gcctgtgcat gaagccacct ataaagactt gatagacctg tcccactttt 8700 taaaagaaaa ggggggactg gaagggattt ggtggtctaa aagaagagaa gaaatcttgg 8760 atatatatgc acaaaatgaa tggggaatta tacctgactg gcaggcttac acttcaggac 8820 cggggatcag gtatccaaaa gcatttgggt tcctgtttaa actgatccca gtggcagttc 8880 caccggaaca agagaacaat gaatgcaata ggctgctaaa ctcttctcag acaggaatcc 8940 aggaagatcc atggggagaa aggctcatgt ggaagtttga ctctgctctt 
gcctatactt 9000 tctatgctcc cataaagagg ccaggagact tcaagcatgt ccaaagtctt agctatgaag 9060 cttataagaa ggaacctgac tgctgcaaga ggaagtggtg gcgcttctag ccgaccacag 9120 agggttgcta tggcgatacc ctttaaaact gctaactctg gagggacttt ccactagtgc 9180 atgcgcactg gactggggac tttccaggat gacgccgggt gggggagtgg tcagcccaat 9240 ctggctgcat ataagcagct cgctttgcgc ttgtattgag tctctccctg agaggctacc 9300 agattgagcc taggttgttc tctggtgagt ccttgaagga gtgcctgctt gtagccctgg 9360 gcggttcgca ggcccctggc ttgtagctct gggtagctcg tcaggtgttc tggaaaggtc 9420 ttgctaaggg gacgcctttg cttggtcttg gtagacctct agcagtctca gtggccagga 9480 ggctgtggga ttgactaccg cttgcttgcc tttgatgctc aataaagctt acccgaatta 9540 gaaaggcatt caagtgtact cgctcatttt gtctttggta gaaactctgg ttactggaga 9600 tccctcagat ttgtgccaga cttctgatat ctagtgagag t 9641 23 518 PRT SIV - viral 23 Ile Gly Ala Ser Ala Ser Gly Leu Arg Gly Arg Glu Leu Asp Glu Leu 1 5 10 15 Glu Lys Ile Arg Leu Arg Pro Ser Gly Lys Lys Lys Tyr Gln Leu Lys 20 25 30 His Val Ile Trp Val Ser Lys Glu Leu Asp Arg Phe Gly Leu His Glu 35 40 45 Lys Leu Leu Glu Thr Lys Glu Gly Cys Glu Lys Ile Leu Ser Val Leu 50 55 60 Phe Pro Leu Val Pro Thr Gly Ser Glu Asn Leu Ile Ser Leu Tyr Asn 65 70 75 80 Thr Cys Cys Cys Ile Trp Cys Val His Ala Lys Val Lys Val Ala Asp 85 90 95 Thr Glu Glu Ala Lys Glu Lys Val Lys Gln Cys Tyr His Leu Val Val 100 105 110 Glu Lys Gln Asn Ala Ala Ser Glu Lys Glu Lys Gly Ala Thr Val Thr 115 120 125 Pro Ser Gly His Ser Arg Asn Tyr Pro Ile Gln Ile Val Asn Gln Thr 130 135 140 Pro Val His Gln Gly Ile Ser Pro Arg Thr Leu Asn Ala Trp Val Lys 145 150 155 160 Cys Ile Glu Glu Lys Lys Phe Ser Pro Glu Ile Val Pro Met Phe Ile 165 170 175 Ala Leu Ser Glu Gly Cys Leu Pro Tyr Asp Leu Asn Gly Met Leu Asn 180 185 190 Ala Ile Gly Asp His Gln Gly Ala Leu Gln Ile Val Lys Asp Val Ile 195 200 205 Asn Asp Glu Ala Ala Asp Trp Asp Leu Arg His Pro Gln Met Gly Pro 210 215 220 Met Pro Gln Gly Val Leu Arg Asn Pro Thr Gly Ser Asp Ile Ala Gly 225 230 235 240 Thr Thr Ser Ser Ile Glu Glu Gln Ile Glu Trp Thr Thr Arg Gln 
Gln 245 250 255 Asp Gln Val Asn Val Gly Gly Ile Tyr Lys Gln Trp Ile Val Leu Gly 260 265 270 Leu Gln Lys Cys Val Ser Met Tyr Asn Pro Val Asn Ile Leu Asp Ile 275 280 285 Lys Gln Gly Pro Lys Glu Pro Phe Lys Asp Tyr Val Asp Arg Phe Tyr 290 295 300 Lys Ala Leu Arg Ala Glu Arg Thr Asp Pro Gln Val Lys Asn Trp Met 305 310 315 320 Thr Gln Thr Leu Leu Ile Gln Asn Ala Asn Pro Asp Cys Lys Ala Ile 325 330 335 Leu Lys Gly Leu Gly Met Asn Pro Thr Leu Glu Glu Met Leu Leu Ala 340 345 350 Cys Gln Gly Val Gly Gly Pro Lys Tyr Lys Ala Gln Met Met Ala Glu 355 360 365 Ala Met Gln Glu Val Gln Gly Lys Ile Met Met Gln Ala Ser Gly Gly 370 375 380 Pro Pro Arg Gly Pro Pro Arg Gln Pro Pro Arg Asn Pro Arg Cys Pro 385 390 395 400 Asn Cys Gly Lys Phe Gly His Val Leu Arg Asp Cys Arg Ala Pro Arg 405 410 415 Lys Arg Gly Cys Phe Lys Cys Gly Asp Pro Gly His Leu Met Arg Asn 420 425 430 Cys Pro Lys Met Val Asn Phe Leu Gly Asn Ala Pro Trp Gly Ser Gly 435 440 445 Lys Pro Arg Asn Phe Pro Ala Val Pro Leu Thr Pro Thr Ala Pro Pro 450 455 460 Met Pro Gly Leu Glu Asp Pro Ala Glu Lys Met Leu Leu Asp Tyr Met 465 470 475 480 Lys Lys Gly Gln Gln Met Lys Ala Glu Arg Glu Ala Lys Arg Glu Lys 485 490 495 Asp Lys Gly Pro Tyr Glu Ala Ala Tyr Asn Ser Leu Ser Ser Leu Phe 500 505 510 Gly Thr Asp Gln Leu Gln 515 24 1016 PRT SIV - viral 24 Phe Phe Arg Glu Cys Ser Leu Gly Gln Trp Gln Thr Gln Glu Leu Ser 1 5 10 15 Cys Arg Ala Thr Asp Pro Asn Gly Thr Pro Asp Ala Arg Ile Arg Gly 20 25 30 Pro Ser Arg Glu Asp Ala Thr Gly Leu His Glu Glu Gly Ala Thr Asp 35 40 45 Glu Gly Arg Glu Gly Ser Gln Thr Gly Glu Gly Gln Arg Pro Leu Arg 50 55 60 Gly Gly Leu Gln Leu Pro Gln Phe Ser Leu Trp Asn Arg Pro Thr Thr 65 70 75 80 Val Val Glu Ile Glu Gly Gln Lys Val Glu Ala Leu Leu Asp Thr Gly 85 90 95 Ala Asp Asp Thr Val Ile Lys Asp Leu Gln Leu Thr Gly Asn Trp Lys 100 105 110 Pro Gln Ile Ile Gly Gly Ile Gly Gly Ala Ile Arg Val Lys Gln Tyr 115 120 125 Phe Asn Cys Lys Ile Thr Val Ala Gly Lys Ser Thr His Ala Ser Val 130 135 140 Leu Val Gly Pro Thr Pro Val 
Asn Ile Ile Gly Arg Asn Val Leu Lys 145 150 155 160 Lys Leu Gly Cys Thr Leu Asn Phe Pro Ile Ser Lys Ile Glu Thr Val 165 170 175 Lys Val Thr Leu Lys Pro Gly Thr Asp Gly Pro Arg Ile Lys Gln Trp 180 185 190 Pro Leu Ser Lys Glu Lys Ile Leu Ala Leu Gln Glu Ile Cys Asn Gln 195 200 205 Met Glu Lys Glu Gly Lys Ile Ser Arg Ile Gly Pro Glu Asn Pro Tyr 210 215 220 Asn Thr Pro Val Phe Cys Ile Lys Lys Lys Asp Gly Ala Ser Trp Arg 225 230 235 240 Lys Leu Val Asp Phe Arg Gln Leu Asn Lys Val Thr Gln Asp Phe Phe 245 250 255 Glu Val Gln Leu Gly Ile Pro His Pro Gly Gly Leu Lys Gln Cys Glu 260 265 270 Gln Ile Thr Val Leu Asp Ile Gly Asp Ala Tyr Phe Ser Cys Pro Leu 275 280 285 Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn 290 295 300 Asn Gln Gly Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly 305 310 315 320 Trp Lys Gly Ser Pro Ala Ile Phe Gln Ala Thr Ala Asp Lys Ile Leu 325 330 335 Lys Thr Phe Lys Glu Glu Tyr Pro Glu Val Leu Ile Tyr Gln Tyr Met 340 345 350 Asp Asp Leu Phe Val Gly Ser Asp Leu Asn Ala Thr Glu His Asn Lys 355 360 365 Met Ile Asn Lys Leu Arg Glu His Leu Arg Phe Trp Gly Leu Glu Thr 370 375 380 Pro Asp Lys Lys Phe Gln Lys Glu Pro Pro Phe Glu Trp Met Gly Tyr 385 390 395 400 Val Leu His Pro Lys Lys Trp Thr Val Gln Lys Ile Gln Leu Pro Glu 405 410 415 Lys Glu Gln Trp Thr Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu 420 425 430 Asn Trp Ala Ser Gln Ile Tyr Ser Gly Ile Lys Thr Lys Glu Leu Cys 435 440 445 Lys Leu Ile Arg Gly Ala Lys Pro Leu Asp Glu Ile Val Glu Trp Thr 450 455 460 Arg Glu Ala Glu Leu Glu Tyr Glu Glu Asn Lys Ile Ile Val Gln Glu 465 470 475 480 Glu Val His Gly Val Tyr Tyr Gln Pro Glu Lys Pro Leu Met Ala Lys 485 490 495 Val Gln Lys Leu Thr Gln Gly Gln Trp Ser Tyr Gln Ile Glu Gln Glu 500 505 510 Glu Asn Lys Pro Leu Lys Ala Gly Lys Tyr Ala Arg Thr Lys Asn Ala 515 520 525 His Thr Asn Glu Leu Arg Thr Leu Ala Gly Leu Val Gln Lys Ile Ala 530 535 540 Lys Glu Cys Ile Val Ile Trp Gly Arg Leu Pro Lys Phe Tyr Leu Pro 545 550 555 560 Leu Glu Arg Glu Val Trp Asp 
Gln Trp Trp His Asp Tyr Trp Gln Val 565 570 575 Thr Trp Ile Pro Glu Trp Glu Phe Ile Ser Thr Pro Pro Leu Ile Arg 580 585 590 Leu Trp Tyr Asn Leu Leu Lys Glu Pro Ile Pro Gly Glu Asp Val Tyr 595 600 605 Tyr Val Asp Gly Ala Ala Asn Arg Asn Ser Lys Glu Gly Lys Ala Gly 610 615 620 Tyr Tyr Thr Ala Arg Gly Lys Ser Lys Val Ile Ala Leu Glu Asn Thr 625 630 635 640 Thr Asn Gln Lys Ala Glu Leu Lys Ala Ile Glu Leu Ala Leu Lys Asp 645 650 655 Ser Gly Pro Arg Val Asn Ile Val Thr Asp Ser Gln Tyr Ala Leu Gly 660 665 670 Ile Leu Thr Ala Ser Pro Asp Gln Ser Asp Asn Pro Ile Val Arg Glu 675 680 685 Ile Ile Asn Leu Met Ile Ala Lys Glu Ala Val Tyr Leu Ser Trp Val 690 695 700 Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln Ile Asp Lys Leu Val 705 710 715 720 Ser Gln Gly Ile Arg Gln Val Leu Phe Leu Glu Gly Ile Asp Arg Ala 725 730 735 Gln Glu Glu His Asp Lys Tyr His Asn Asn Trp Arg Ala Leu Ala Gln 740 745 750 Glu Phe Ser Ile Pro Pro Ile Val Ala Lys Glu Ile Val Ala Gln Cys 755 760 765 Pro Lys Cys Gln Ile Lys Gly Glu Pro Ile His Gly Gln Val Asp Ala 770 775 780 Ser Pro Gly Thr Trp Gln Met Asp Cys Thr His Leu Glu Gly Lys Val 785 790 795 800 Ile Ile Val Ala Val His Val Ala Ser Gly Tyr Leu Glu Ala Glu Val 805 810 815 Ile Pro Ala Glu Thr Gly Lys Glu Thr Ala His Phe Leu Leu Lys Leu 820 825 830 Ala Gly Arg Trp Pro Val Lys His Leu His Thr Asp Asn Gly Pro Asn 835 840 845 Phe Val Ser Glu Lys Val Ala Thr Val Cys Trp Trp Ala Gln Ile Glu 850 855 860 His Thr Thr Gly Val Pro Tyr Asn Pro Gln Ser Gln Gly Val Val Glu 865 870 875 880 Ala Lys Asn His His Leu Lys Thr Ile Ile Glu Gln Val Arg Asp Gln 885 890 895 Ala Glu Lys Leu Glu Thr Ala Val Gln Met Ala Val Leu Ile His Asn 900 905 910 Phe Lys Arg Lys Gly Gly Ile Gly Glu Tyr Ser Pro Gly Glu Arg Ile 915 920 925 Val Asp Ile Ile Thr Thr Asp Ile Leu Thr Thr Lys Leu Gln Gln Asn 930 935 940 Ile Ser Lys Ile Gln Asn Phe Arg Val Tyr Tyr Arg Glu Gly Arg Asp 945 950 955 960 Gln Gln Trp Lys Gly Pro Ala Glu Leu Ile Trp Lys Gly Glu Gly Ala 965 970 975 Val Val Ile Lys Glu Gly Thr Asp 
Leu Lys Val Val Pro Arg Arg Lys 980 985 990 Ala Lys Ile Ile Arg Asp Tyr Gly Lys Ala Val Asp Ser Asn Ser His 995 1000 1005 Met Glu Ser Arg Glu Glu Ser Ala 1010 1015 25 853 PRT SIV - viral 25 Gln Trp Val Thr Val Tyr Tyr Gly Thr Pro Lys Trp His Pro Ala Arg 1 5 10 15 Thr His Leu Phe Cys Ala Thr Asp Asn Asn Ser Phe Trp Val Thr Thr 20 25 30 Ser Cys Val Pro Ser Leu Leu His Tyr Glu Glu Gln His Ile Pro Asn 35 40 45 Ile Thr Glu Asn Phe Thr Gly Pro Ile Thr Glu Asn Glu Val Ile Arg 50 55 60 Gln Ala Trp Gly Ala Ile Ser Ser Met Ile Asp Ala Val Leu Lys Pro 65 70 75 80 Cys Val Lys Leu Thr Pro Tyr Cys Val Lys Met Lys Cys Thr Lys Gly 85 90 95 Asp Thr Asp Thr Thr Glu Arg Thr Thr Ser Thr Thr Ser Ser Trp Ser 100 105 110 Thr Ser Thr Pro Thr Ser Thr Pro Met Thr Pro Asn Thr Thr Gly Leu 115 120 125 Asp Ile Asp Ser Asn Asn Thr Glu Pro Thr Thr Gln Glu Asn Arg Ile 130 135 140 Cys Lys Phe Asn Thr Thr Gly Leu Cys Arg Asp Cys Arg Leu Glu Ile 145 150 155 160 Glu Glu Asn Phe Arg Tyr Gln Asp Ile Thr Cys Arg Asn Ser Ser Glu 165 170 175 Asp Thr Glu Glu Cys Tyr Met Thr His Cys Asn Ser Ser Val Ile Thr 180 185 190 Gln Asp Cys Asn Lys Ala Ser Thr Asp Lys Met Thr Phe Arg Leu Cys 195 200 205 Ala Pro Pro Gly Tyr Val Leu Leu Arg Cys Arg Glu Lys Leu Asn Gln 210 215 220 Thr Lys Leu Cys Gly Asn Ile Thr Ala Val Gln Cys Thr Asp Pro Met 225 230 235 240 Pro Ala Thr Ile Ser Thr Met Phe Gly Phe Asn Gly Thr Lys His Asp 245 250 255 Tyr Asp Glu Leu Ile Leu Thr Asn Pro Gln Lys Ile Asn Glu Phe His 260 265 270 Asp His Lys Tyr Val Tyr Arg Val Asp Lys Lys Trp Lys Leu Gln Val 275 280 285 Val Cys Arg Arg Lys Gly Asn Arg Ser Ile Ile Ser Thr Pro Ser Ala 290 295 300 Thr Gly Leu Leu Phe Tyr His Gly Leu Glu Pro Gly Lys Asn Leu Lys 305 310 315 320 Lys Gly Met Cys Gln Leu Lys Gly Leu Trp Gly Lys Ala Met His Gln 325 330 335 Leu Ser Glu Glu Leu Arg Lys Ile Asn Gly Ser Ile Tyr Arg Lys Trp 340 345 350 Asn Glu Thr Ala Gly Cys Arg Lys Leu Asn Lys Gln Asn Gly Thr Gly 355 360 365 Cys Ser Leu Lys Thr Ile Glu Val Ser Glu Tyr Thr Thr Glu Gly Asp 
370 375 380 Pro Gly Ala Glu Thr Ile Met Leu Leu Cys Gly Gly Glu Tyr Phe Phe 385 390 395 400 Cys Asn Trp Thr Lys Ile Trp Lys Thr Trp Asn Asn Gln Thr Ser Asn 405 410 415 Val Trp Tyr Pro Trp Met Ser Cys Asn Ile Arg Gln Ile Val Asp Asp 420 425 430 Trp His Lys Val Gly Lys Lys Ile Tyr Met Pro Pro Ala Ser Gly Phe 435 440 445 Asn Asn Glu Ile Arg Cys Thr Asn Asp Val Thr Glu Met Phe Phe Glu 450 455 460 Val Gln Lys Lys Glu Glu Asn Lys Tyr Leu Ile Lys Phe Ile Pro Gln 465 470 475 480 Asp Glu Ile Gln Asn Gln Tyr Thr Ala Val Gly Ala His Tyr Lys Leu 485 490 495 Val Lys Val Asp Pro Ile Gly Phe Ala Pro Thr Asp Val His Arg Tyr 500 505 510 His Leu Pro Asp Val Lys Gln Lys Arg Gly Ala Val Leu Leu Gly Met 515 520 525 Leu Gly Leu Leu Gly Leu Ala Gly Ser Ala Met Gly Ser Val Ala Ile 530 535 540 Ala Leu Thr Val Gln Ser Gln Ala Leu Leu Asn Gly Ile Val Glu Gln 545 550 555 560 Gln Lys Val Leu Leu Ser Leu Ile Asp Gln His Ser Glu Leu Leu Lys 565 570 575 Leu Thr Ile Trp Gly Val Lys Asn Leu Gln Ala Arg Leu Thr Ala Leu 580 585 590 Glu Glu Tyr Val Ala Asp Gln Ser Arg Leu Ala Val Trp Gly Cys Ser 595 600 605 Phe Ser Gln Val Cys His Thr Asn Val Lys Trp Pro Asn Asp Ser Ile 610 615 620 Val Pro Asn Trp Thr Ser Glu Thr Trp Leu Glu Trp Asp Lys Arg Val 625 630 635 640 Thr Ala Ile Thr Thr Asn Met Thr Ile Asp Leu Gln Arg Ala Tyr Glu 645 650 655 Leu Glu Gln Lys Asn Met Phe Glu Leu Gln Lys Leu Gly Asp Leu Thr 660 665 670 Ser Trp Ala Ser Trp Phe Asp Leu Thr Trp Trp Phe Lys Tyr Ile Lys 675 680 685 Ile Gly Ile Leu Ile Ile Ile Val Ile Ile Gly Leu Arg Ile Leu Ala 690 695 700 Cys Leu Trp Ser Val Leu Gly Arg Phe Arg Gln Gly Tyr Arg Pro Leu 705 710 715 720 Pro Tyr Val Phe Lys Gly Asp Tyr His Arg Pro His Asn Leu Lys Gln 725 730 735 Pro Asp Lys Glu Arg Gly Glu Glu Gln Asp Arg Glu Lys Gln Asn Ile 740 745 750 Ser Ser Glu Asn Tyr Arg Pro Gly Ser Gly Arg Ala Trp Ser Lys Glu 755 760 765 Gln Val Glu Thr Trp Trp Lys Glu Ser Arg Leu Tyr Ile Trp Leu Lys 770 775 780 Ser Thr Gln Ala Val Ile Glu Tyr Gly Trp Gln Glu Leu Lys Ala Ala 785 
790 795 800 Gly Ala Glu Ile Tyr Lys Ile Leu Gln Ser Ala Ala Gln Arg Leu Trp 805 810 815 Ser Gly Gly His Gln Leu Gly Leu Ser Cys Ile Arg Ala Ala Thr Ala 820 825 830 Phe Gly Arg Gly Val Arg Asn Ile Pro Arg Arg Ile Arg Gln Gly Ala 835 840 845 Glu Val Leu Leu Asn 850 26 32 PRT SIM27.ENV 26 Arg Leu Thr Ala Leu Glu Glu Tyr Val Ala Asp Gln Ser Arg Leu Ala 1 5 10 15 Val Trp Gly Cys Ser Phe Ser Gln Val Cys His Thr Asn Val Lys Trp 20 25 30 27 32 PRT SIV-Mandrill, MNDGB1 27 Arg Leu Thr Ser Leu Glu Asn Tyr Ile Lys Asp Gln Ala Leu Leu Ser 1 5 10 15 Gln Trp Gly Cys Ser Trp Ala Gln Val Cys His Thr Ser Val Glu Trp 20 25 30 28 32 PRT HIV1-N, YBF30 28 Lys Val Leu Ala Ile Glu Arg Tyr Leu Arg Asp Gln Gln Ile Leu Ser 1 5 10 15 Leu Trp Gly Cys Ser Gly Lys Thr Ile Cys Tyr Thr Thr Val Pro Trp 20 25 30 29 32 PRT HIV1-C, 96bw05.02 29 Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Leu Leu Gly 1 5 10 15 Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp 20 25 30 30 32 PRT HIV1-O, ANT70C 30 Arg Leu Leu Ala Leu Glu Thr Leu Leu Gln Asn Gln Gln Leu Leu Ser 1 5 10 15 Leu Trp Gly Cys Lys Gly Lys Leu Val Cys Tyr Thr Ser Val Lys Trp 20 25 30 31 32 PRT SIV-CPZ, CPZGAB 31 Arg Leu Leu Ala Val Glu Arg Tyr Leu Gln Asp Gln Gln Ile Leu Gly 1 5 10 15 Leu Trp Gly Cys Ser Gly Lys Ala Val Cys Tyr Thr Thr Val Pro Trp 20 25 30 32 32 PRT HIV1-O, MVP5180 32 Arg Leu Gln Ala Leu Glu Thr Leu Ile Gln Asn Gln Gln Arg Leu Asn 1 5 10 15 Leu Trp Gly Cys Lys Gly Lys Leu Ile Cys Tyr Thr Ser Val Lys Trp 20 25 30 33 32 PRT SIV-1hoesti 33 Arg Leu Thr Ala Leu Glu Glu Tyr Val Lys His Gln Ala Leu Leu Ala 1 5 10 15 Ser Trp Gly Cys Gln Trp Lys Gln Val Cys His Thr Asn Val Glu Trp 20 25 30 34 32 PRT SIV-SYKES 34 Arg Leu Thr Ala Leu Glu Thr Tyr Leu Arg Asp Gln Ala Ile Leu Ser 1 5 10 15 Asn Trp Gly Cys Ala Phe Lys Gln Ile Cys His Thr Ala Val Thr Trp 20 25 30 35 32 PRT SIV-CPZ, CPZANT 35 Arg Met Leu Ala Val Glu Lys Tyr Leu Arg Asp Gln Gln Leu Leu Ser 1 5 10 15 Leu Trp Gly Cys Ala Asp Lys Val Thr Cys His Thr Thr Val Pro 
Trp 20 25 30 36 32 PRT SIV-CPZ-US 36 Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Ile Leu Gly 1 5 10 15 Leu Trp Gly Cys Ser Gly Lys Thr Ile Cys Tyr Thr Thr Val Pro Trp 20 25 30 37 32 PRT HIV1-F, 93br020.1 37 Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Leu Leu Gly 1 5 10 15 Leu Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Asn Val Pro Trp 20 25 30 38 32 PRT HIV1-A, 92ug037 38 Arg Val Leu Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly 1 5 10 15 Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Pro Thr Asn Val Pro Trp 20 25 30 39 32 PRT HIV1-H, 90cr056 39 Arg Val Leu Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly 1 5 10 15 Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Asn Val Pro Trp 20 25 30 40 32 PRT HIV1-D, NDK 40 Arg Val Leu Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly 1 5 10 15 Ile Trp Gly Cys Ser Gly Arg His Ile Cys Thr Thr Asn Val Pro Trp 20 25 30 41 32 PRT HIV2-B, UC1 41 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Leu Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 42 32 PRT SIV-D, MNE 42 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ala Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 43 32 PRT SIV-D, MM239 43 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ala Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 44 32 PRT SIV, SME543 44 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 45 32 PRT SIV-D, SMM-PBJ-6P9 45 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 46 32 PRT SIV-D, STM 46 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 47 32 PRT HIV2-A, CAM2 47 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln 
Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 48 32 PRT HIV2-A, GH1 48 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 49 32 PRT HIV2-B, EHO 49 Arg Val Thr Ala Ile Glu Lys Tyr Leu Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 50 32 PRT SIV-SMM, PGM 50 Arg Val Thr Ala Ile Glu Lys Tyr Arg Lys Asp Gln Ala Gln Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 51 32 PRT SIV-VERVET, AGM155 51 Arg Val Thr Ala Leu Glu Lys Tyr Leu Ala Asp Gln Ala Arg Leu Asn 1 5 10 15 Ala Trp Gly Cys Ala Trp Lys Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 52 32 PRT SIV-VERVET, AGM3 52 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Ala Arg Leu Asn 1 5 10 15 Ala Trp Gly Cys Ala Trp Lys Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 53 32 PRT SIV-VERVET, AGMSAB1 53 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Ala Arg Leu Asn 1 5 10 15 Ile Trp Gly Cys Ala Phe Arg Gln Val Cys His Thr Thr Val Leu Trp 20 25 30 54 32 PRT SIV-VERVET, AGMTY6 54 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Ala Arg Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Trp Lys Gln Val Cys His Thr Thr Val Glu Trp 20 25 30 55 32 PRT SIV-GRIVET, AGM677A 55 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Ala Arg Leu Asn 1 5 10 15 Ser Trp Gly Cys Ala Trp Lys Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 56 32 PRT SIV-VERVET, REV 56 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Ala Arg Leu Asn 1 5 10 15 Val Trp Gly Cys Ala Trp Lys Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 57 32 PRT SIV-TANTALUS, TAN1 57 Arg Val Thr Ala Leu Glu Lys Tyr Leu Glu Asp Gln Thr Arg Leu Asn 1 5 10 15 Leu Trp Gly Cys Ala Phe Lys Gln Val Cys His Thr Thr Val Pro Trp 20 25 30 

1. The immunodeficiency virus SIM27, whose RNA or a part thereof is complementary to the sequence according to Table 2 or Table 3 or Table 4 or Table 5 or Table 6 or Table 7.
2. A variant of the immunodeficiency virus as claimed in claim 1, comprising an RNA which is complementary to a DNA which is investigated by the following process:
(a) extraction of the sequences mentioned in Table 11 from the gene database “GenBank” and loading of the sequences, including the sequence to be investigated, into the computer program “ClustalW Version 1.74”,
(b) multiple alignment of the sequences according to Table 11 and of the sequence to be investigated, and phylogenetic analysis of the data obtained by the neighbor-joining method by means of the computer program “ClustalW Version 1.74”,
(c) visualization of the family tree data obtained as a family tree using a suitable presentation program;
wherein the variant branches off from the branch of the family tree at the end of which SIM27 is located, along the section from that end up to the first branching point.
3. The GAG protein of SIM27 or a variant thereof, which is investigated by the following process:
(a) extraction of the GAG portions of the sequences mentioned in Table 11 from the gene database “GenBank” and loading of the corresponding amino acid sequences into the computer program “ClustalW Version 1.74”,
(b) multiple alignment of these amino acid sequences with the sequence according to Table 8 and phylogenetic analysis of the data obtained by the neighbor-joining method by means of the computer program “ClustalW Version 1.74”,
(c) visualization of the family tree data obtained as a family tree using a suitable presentation program;
wherein the variant branches off from the branch of the family tree at the end of which SIM27-gag is located, along the section from that end up to the first branching point.
4. The ENV protein of SIM27 or a variant thereof, which is investigated by the following process:
(a) extraction of the ENV portions of the sequences mentioned in Table 11 from the gene database “GenBank” and loading of the corresponding amino acid sequences into the computer program “ClustalW Version 1.74”,
(b) multiple alignment of these amino acid sequences with the sequences according to Table 10 and phylogenetic analysis of the data obtained by the neighbor-joining method by means of the computer program “ClustalW Version 1.74”,
(c) visualization of the family tree data obtained as a family tree using a suitable presentation program;
wherein the variant branches off from the branch of the family tree at the end of which SIM27-env is located, along the section from that end up to the first branching point.
5. The POL protein of SIM27 or a variant thereof, which is investigated by the following process:
(a) extraction of the POL portions of the sequences mentioned in Table 11 from the gene database “GenBank” and loading of the corresponding amino acid sequences into the computer program “ClustalW Version 1.74”,
(b) multiple alignment of these amino acid sequences with the sequences according to Table 9 and phylogenetic analysis of the data obtained by the neighbor-joining method by means of the computer program “ClustalW Version 1.74”,
(c) visualization of the family tree data obtained as a family tree using a suitable presentation program;
wherein the variant branches off from the branch of the family tree at the end of which SIM27-pol is located, along the section from that end up to the first branching point.
6. The use of a virus as claimed in claim 1 or of a protein as claimed in one or more of claims 3 to 5 for the detection of antibodies directed against an immunodeficiency virus in a sample.
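
The branching criterion used in claims 2 to 5 can be illustrated with a minimal Python sketch. It is not the ClustalW Version 1.74 procedure itself: it assumes that pairwise distances from a multiple alignment are already available, builds only the neighbor-joining topology, and reads "branches off before the first branching point" as "the investigated sequence and SIM27 hang off the same internal node". The function names, the taxon labels `candidate`, `ref_A`, `ref_B`, `ref_C` and all distance values are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def neighbor_joining_attachments(names, dist):
    """Run neighbor-joining and return {node: node it is attached to}.

    names: list of taxon labels (at least three, none named like "node0").
    dist:  dict mapping frozenset({a, b}) to the pairwise distance.
    Only the topology is tracked; branch lengths are not computed.
    """
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    d = dict(dist)
    nodes = list(names)
    attach = {}
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums used by the neighbor-joining selection criterion Q(i, j).
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        u = f"node{counter}"  # new internal node joining i and j
        counter += 1
        attach[i] = attach[j] = u
        d_ij = d[frozenset((i, j))]
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k != i and k != j:
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d_ij
                )
        nodes = [k for k in nodes if k != i and k != j] + [u]
    # The last two nodes are connected by a single edge.
    a, b = nodes
    attach[a] = b
    return attach


def shares_branching_point(attach, taxon_a, taxon_b):
    """True if both tips are attached to the same internal node."""
    return attach[taxon_a] == attach[taxon_b]


# Hypothetical distances, chosen only to illustrate the check.
pairs = {
    ("SIM27", "candidate"): 0.2,
    ("SIM27", "ref_A"): 0.8,
    ("SIM27", "ref_B"): 0.9,
    ("SIM27", "ref_C"): 0.9,
    ("candidate", "ref_A"): 0.8,
    ("candidate", "ref_B"): 0.9,
    ("candidate", "ref_C"): 0.9,
    ("ref_A", "ref_B"): 0.9,
    ("ref_A", "ref_C"): 0.9,
    ("ref_B", "ref_C"): 0.4,
}
toy_dist = {frozenset(p): v for p, v in pairs.items()}
names = ["SIM27", "candidate", "ref_A", "ref_B", "ref_C"]

attach = neighbor_joining_attachments(names, toy_dist)
print(shares_branching_point(attach, "SIM27", "candidate"))  # True
print(shares_branching_point(attach, "SIM27", "ref_A"))      # False
```

In this toy example the sequence labelled `candidate` joins the tree on the terminal branch leading to SIM27 and would therefore count as a variant under the reading assumed here, whereas `ref_A` would not. In practice the distances would come from the ClustalW alignment of the Table 11 sequences rather than from hand-picked values.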